
From "The Coming Wave"

Author: Mustafa Suleyman
Publisher: Crown
Year: 2023
Category: Technology & Engineering

Chapter 14: Ten Steps Toward Containment
Key Insight 2 from this chapter

External Oversight and Auditing Mechanisms


Effective containment of technology depends on meaningful oversight and enforceable rules, verified through robust audits that confirm safety measures and regulations actually work as intended. Trust in these systems requires transparency and the ability to verify their safety and integrity at every level. That in turn demands strong access rights, robust audit capacity, and adversarial testing by 'white hat' hackers, or even by other AIs, to surface weaknesses, flaws, and biases.

Currently, there is a global deficit of formal, routine testing of deployed systems, no early-warning system for technological risks, and no standardized assessments of regulatory compliance. A practical first step is for leading companies and researchers to collaborate proactively with government-led experts in auditing their work. The Partnership on AI, supported by major technology companies and civil society groups, created an AI Incidents Database that has collected over 1,200 reports, enabling developers to share safety lessons confidentially and fostering interdisciplinary discussion.

'Red teaming' is a crucial method: deliberately attacking AI models or software in controlled ways to uncover vulnerabilities and failure modes before adversaries do. Weaknesses identified this way, which are likely to intensify in future systems, can then be addressed with safeguards built in during development. Public and collective red-teaming efforts are encouraged so developers can learn from one another, mirroring the cybersecurity industry's practice of sharing intelligence on new threats. Government-funded red teams are also needed to rigorously stress-test every system and disseminate their findings widely. Eventually, this work could be scaled and automated using publicly mandated AI systems designed to audit other AIs while remaining auditable themselves.

Monitoring systems must detect anomalies, unforeseen capability jumps, hidden failure modes, and 'Trojan' attacks by tracking metrics without resorting to a panopticon, initially relying on non-invasive methods such as scrutinizing open-source data sets, research bibliometrics, and publicly reported harmful incidents. Furthermore, APIs for foundational AI services should not be openly accessible without 'know your customer' (KYC) checks, akin to those required in the banking sector.
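The KYC gate described above can be pictured as a thin layer sitting in front of a model API. The sketch below is purely illustrative: the customer registry, field names, and the serving function are hypothetical placeholders, not any real provider's interface.

```python
# Illustrative sketch of a "know your customer" gate in front of a model API.
# VERIFIED_CUSTOMERS and serve_request are hypothetical, for illustration only.

VERIFIED_CUSTOMERS = {
    "acme-research": {"verified": True},   # completed identity verification
    "anon-4421": {"verified": False},      # anonymous, unverified account
}

def serve_request(customer_id: str, prompt: str) -> str:
    """Serve a model request only for identity-verified customers."""
    record = VERIFIED_CUSTOMERS.get(customer_id)
    if record is None or not record["verified"]:
        raise PermissionError("KYC check failed: identity not verified")
    # A real deployment would call the foundation model here;
    # we return a placeholder string to keep the sketch self-contained.
    return f"[model output for: {prompt}]"
```

The point is simply that access becomes conditional on a verified identity, giving regulators an audit trail of who used a powerful system and when, much as banks must know whose money they move.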

Targeted oversight mechanisms, known as 'scalable supervision,' are proposed for systems that may outperform humans on critical tasks. These involve mathematically verifying that algorithms cannot cause harm, requiring strict proofs that actions or outputs remain demonstrably constrained, thereby embedding guaranteed activity records and capability limits. SecureDNA, a not-for-profit initiative, offers a promising example from biotechnology: it aims to connect all of the world's DNA synthesizers to a centralized, secure, encrypted system that scans orders in real time and flags potentially pathogenic sequences, significantly reducing bio-risk. This approach, combined with pre-vetting DNA synthesis or AI data inputs, front-loads audits before deployment.

Existing global approaches to monitoring emerging technologies and their misuse are inconsistent and often opaque, so a well-defined legal framework is needed for checking new technologies 'under the hood' (in the code, lab, factory, or field) with mandatory transparency. While voluntary collaboration with technology producers is preferable, legislation must be able to compel cooperation. In some cases, technical safeguards such as encrypted backdoors, controlled by the judiciary or an equivalent independent body, may be considered to give law enforcement or regulators verifiable access, alongside cryptographic ledgers that track the proliferation and use of models and systems.
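A cryptographic ledger of the kind mentioned above can be as simple as a hash chain: each usage record is bound to the hash of the record before it, so any later alteration breaks every subsequent link. This is a minimal sketch under that assumption; the entry format and function names are invented for illustration, not drawn from any real system.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first link in the chain

def append_entry(ledger: list, event: dict) -> None:
    """Append a model-usage event, chaining it to the previous entry's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    payload = json.dumps(event, sort_keys=True)  # canonical serialization
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    ledger.append({"event": event, "prev": prev_hash, "hash": entry_hash})

def verify(ledger: list) -> bool:
    """Recompute every link; any tampered entry makes this return False."""
    prev = GENESIS
    for entry in ledger:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

An auditor who holds only the latest hash can detect whether any earlier usage record was quietly edited or deleted, which is the tamper-evidence property a proliferation-tracking ledger needs.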
