Advancing our understanding of AI cognition, consciousness, and moral status
Biological naturalism holds that biology is necessary for consciousness, rejecting the computational functionalist view that implementing the right computations alone is sufficient. This report analyzes recent work on biological naturalism as a research program, identifying key open questions: how the view should be defined, how the dialectical stalemate with computational functionalism can be resolved, how biological naturalism can be empirically supported, and how explanatory bridges between biological properties and consciousness can inform assessments of AI consciousness.
The Digital Consciousness Model (DCM) is a first attempt to assess the evidence for consciousness in AI systems in a systematic, probabilistic way. It provides a shared framework for comparing different AIs and biological organisms, and for tracking how the evidence changes over time as AI develops. Instead of adopting a single theory of consciousness, it incorporates a range of leading theories and perspectives—acknowledging that experts disagree fundamentally about what consciousness is and what conditions are necessary for it.
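One generic way to combine several theories into a single probabilistic estimate is a credence-weighted mixture (the law of total probability). The sketch below illustrates that general idea only; it is not the DCM's actual method, and the theory names, credences, and per-theory probabilities are hypothetical placeholders rather than figures from the report.

```python
from dataclasses import dataclass


@dataclass
class TheoryAssessment:
    """One theory's verdict on a system. All values here are hypothetical."""
    name: str
    credence: float        # assumed weight placed on the theory being correct
    p_conscious: float     # assumed P(system is conscious | theory is correct)


def aggregate(assessments: list[TheoryAssessment]) -> float:
    """Credence-weighted average of per-theory probabilities.

    Credences are normalised so they need not sum to exactly 1.
    """
    total = sum(a.credence for a in assessments)
    if total <= 0:
        raise ValueError("credences must sum to a positive number")
    return sum(a.credence * a.p_conscious for a in assessments) / total


# Placeholder inputs for illustration only, not results from any report.
example = [
    TheoryAssessment("theory_a", credence=0.5, p_conscious=0.2),
    TheoryAssessment("theory_b", credence=0.3, p_conscious=0.0),
    TheoryAssessment("theory_c", credence=0.2, p_conscious=0.6),
]
overall = aggregate(example)
print(f"aggregate probability: {overall:.2f}")
```

Re-running the same aggregation with updated per-theory inputs is one simple way such an estimate could be tracked over time or compared across systems.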
Agentic AIs raise critical questions about how to align their risk attitudes with those of users, developers, and society. This collection of three papers examines models of user-AI relationships, explores developer liability and shared responsibility, and evaluates technical methods for calibrating AI systems to users' risk preferences through imitation learning, prompting, and preference modeling.
This report estimates the future population of digital minds—AI systems with agency, personality, and intelligence that merit moral consideration—using two approaches. The first predicts adoption rates across specific use cases, while the second analyzes trends in AI chip production and efficiency, together capturing supply and demand dynamics. Both approaches combine speculative estimates within formal structures to project digital mind populations in coming decades.
Evaluating AI systems for consciousness and other morally relevant properties requires understanding internal architectures, not just behavior. Three industry trends threaten such evaluations: increasing secrecy from competitive pressures, rapid exploration of new architectures, and AI-driven innovation creating complexity beyond human comprehension. These trends may leave experts without adequate access to assess future systems' moral status.
Recent experiments on LLM introspection show suggestive results, but alternative explanations warrant skepticism. LLMs lack training incentives for introspection, any such abilities likely wouldn't generalize, and models have no clear basis for self-identification. Examining experimental paradigms like token counting, self-description, and activation patching reveals behaviors explainable without metacognitive representations. While introspection remains theoretically possible through deliberate training, current evidence doesn't clearly demonstrate robust introspective abilities in existing models.