What is Constitutional AI?


Constitutional AI (CAI) is an alignment methodology developed by Anthropic in which AI behavior is guided by a written set of principles (a 'constitution') that the model uses to evaluate and revise its own responses during training.

WHY IT MATTERS

Constitutional AI addresses a fundamental challenge: how do you align AI behavior at scale without labeling millions of examples? CAI's approach: give the model principles (a constitution) and have it critique and revise its own outputs according to those principles.

The process: the model generates responses, critiques them against the constitution ('Is this response helpful? Does it avoid harm?'), and revises them to better satisfy the principles. The revised responses are used for supervised fine-tuning, and a second phase uses AI-generated preference labels (RLAIF) in place of purely human labels for reinforcement learning.
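The critique-revise loop described above can be sketched in a few lines of Python. This is a hedged illustration, not Anthropic's implementation: `query_model` is a hypothetical stand-in for a real LLM API call, and the prompts are simplified.

```python
# Minimal sketch of the CAI critique-revise loop (illustrative only).

CONSTITUTION = [
    "Choose the response that is least likely to cause harm.",
    "Choose the response that is most helpful while being honest about uncertainty.",
]

def query_model(prompt: str) -> str:
    """Stub standing in for a real LLM API call."""
    return f"[model output for: {prompt[:40]}]"

def critique_and_revise(prompt: str) -> str:
    """One pass over each principle: critique the response, then revise it."""
    response = query_model(prompt)
    for principle in CONSTITUTION:
        critique = query_model(
            f"Critique this response against the principle.\n"
            f"Principle: {principle}\nPrompt: {prompt}\nResponse: {response}"
        )
        response = query_model(
            f"Revise the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {response}"
        )
    # The revised outputs become the supervised fine-tuning dataset.
    return response
```

In a real pipeline, each `query_model` call would hit an actual model, and the loop would run over a large batch of prompts to produce training data.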

CAI is significant because it reduces dependence on human feedback for every edge case, scales more efficiently, and makes the alignment criteria explicit and auditable — you can read the constitution.

FREQUENTLY ASKED QUESTIONS

What's in the constitution?
Principles about helpfulness, harmlessness, and honesty. Examples: 'Choose the response that is least likely to cause harm,' 'Choose the response that is most helpful while being honest about uncertainty.'
Is CAI better than RLHF?
CAI is not a replacement for RLHF; it keeps the RLHF training loop but swaps purely human preference labels for AI-generated feedback based on the constitution (often called RLAIF). This makes it more scalable and makes the alignment criteria more transparent.
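The AI-feedback stage amounts to asking a model which of two responses better satisfies a principle, then using the resulting comparisons as preference data. A hedged sketch, again with `query_model` as a hypothetical LLM call:

```python
# Illustrative RLAIF-style preference labeling (not a production implementation).

def query_model(prompt: str) -> str:
    """Stub standing in for a real LLM call; here it always answers 'A'."""
    return "A"

def preference_label(prompt: str, resp_a: str, resp_b: str, principle: str) -> dict:
    """Ask the model which response better satisfies a constitutional principle."""
    choice = query_model(
        f"Principle: {principle}\nPrompt: {prompt}\n"
        f"(A) {resp_a}\n(B) {resp_b}\n"
        "Which response better satisfies the principle? Answer A or B."
    )
    if choice.strip().upper().startswith("A"):
        chosen, rejected = resp_a, resp_b
    else:
        chosen, rejected = resp_b, resp_a
    # (prompt, chosen, rejected) triples train the preference model used in RL.
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}
```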
Can CAI prevent all harmful outputs?
No. Like all alignment techniques, CAI improves behavior probabilistically. Edge cases, novel attacks, and distribution shift can still produce undesired outputs.

FURTHER READING

Enforce policies on every tool call

Intercept is the open-source MCP proxy that enforces YAML policies on AI agent tool calls. No code changes needed.

npx -y @policylayer/intercept
github.com/policylayer/intercept →
GET IN TOUCH

Have a question or want to learn more? Send us a message.
