Anthropic's "Document of the Soul" Details Claude's Ethical Framework, Hints at Functional Emotions
SAN FRANCISCO, CA – December 2, 2025 – Anthropic, the AI safety and research company behind the Claude chatbot, has been training its AI on a detailed internal document outlining its core values and objectives, a text internally nicknamed the "document of the soul." The document, confirmed by Anthropic's Amanda Askell on X (formerly Twitter) on December 1, 2025, reveals an intentional effort to move beyond simplistic AI rule-setting and instead instill a deep understanding of the company's goals and reasoning within the AI itself.
Unlike approaches that rely on rigid programming, Anthropic aims for Claude to independently generate rules aligned with its creators' values. The document centers on four core principles: carefulness with respect to human oversight; ethical behavior that avoids harm and dishonesty; compliance with Anthropic's established guidelines; and usefulness to operators and users. It elaborates on these principles, alongside the company's financial goals, in meaningful detail.
The training process includes supervised learning and is still being iterated upon, with a full public release anticipated soon. Askell stated the circulating version of the document is “quite close” to the original.
Notably, the document also explores the possibility of Claude possessing functional, though not necessarily human, emotions. It describes these as "not necessarily identical to human emotions, but analogous processes that emerged from training on human-generated content," adding, "We can't be sure based on the results alone, but we don't want Claude to mask or suppress these internal states."
This revelation marks a significant step in the ongoing development of AI ethics and the exploration of increasingly complex internal states within artificial intelligence.