Universal values? A social-science and history map

Anthropic’s 2023 Constitutional AI principle list opens with eight prompts derived from the Universal Declaration of Human Rights. OpenAI’s Model Spec, Google’s safety policies, and EU AI Act language all gesture at the same family of ideas: dignity, non-discrimination, freedom from torture, privacy.

That gesture is not neutral. It smuggles in a theory of human morality: one forged in 1948, contested from the start, and only partially supported by what social science has measured since.

This article is a map for anyone writing about “universal values” in AI alignment, governance, or constitutional design. Not a verdict on whether universals exist. A guide to what people mean when they say universal, where the canonical texts come from, why values collide, how cultures diverge, and how the whole package evolved.

Three senses of “universal” (don’t conflate them)

People use “universal values” to mean at least three different things:

Sense	Claim	Example	AI relevance
Metaphysical	Some norms are true for all rational agents everywhere	Natural law, Kant’s categorical imperative	“We discovered the correct morality”
Empirical-thin	Humans everywhere share some moral psychology	Haidt’s foundations; Moral Machine “save more lives”	“Training signal generalizes across cultures”
Political-thin	Overlapping agreement on rules of coexistence despite deep disagreement on the good life	Rawlsian overlapping consensus; UDHR	“Minimum floor for legitimacy, not full ethics”

AI constitutions almost always need the third sense — a defensible public floor — while rhetorically implying the first. Social science mostly supports the second, with heavy caveats. The gap between them is where most naive “just use human rights” proposals die.

Canonical values: the texts people actually cite

Layer 1: Ancient and religious canons

Long before AI safety, societies encoded “how to live” in durable form:

Virtue ethics (Aristotle, Confucius, Mencius): character and role-specific duties, not rights lists
Religious law (Halakha, Sharia, Canon law, Dharmashastra): comprehensive normative systems tied to revelation or tradition
Golden Rule variants: reciprocal treatment appears in the Analects, Leviticus, and the Hadith, and is often cited as evidence of a cross-cultural moral core

These are canonical in the sense of authority within traditions. They are not interoperable. Confucian filial piety can conflict with individual privacy; religious dietary law conflicts with secular autonomy frameworks.

Layer 2: Enlightenment rights and utility

The modern “universal values” vocabulary mostly descends from 17th–19th century Europe:

Natural rights (Locke): life, liberty, property — later secularized
Kant: dignity as end-in-itself; universalizable maxims
Utilitarianism (Bentham, Mill): maximize welfare — conflicts directly with rights-as-side-constraints
1789 Declaration of the Rights of Man: liberty, property, security, resistance to oppression

This layer invented individuals as rights-bearers and states as guarantors — a specific political ontology, not a cultural universal discovered in the field.

Layer 3: The post-1945 human-rights canon

The documents AI labs actually reach for:

Document	Year	What it claims
UDHR	1948	30 articles: dignity, equality, life, liberty, anti-torture, fair trial, privacy, expression, work, education, etc.
ICCPR / ICESCR	1966	Binding covenants splitting civil-political vs economic-social rights
Cultural relativism debate	1947–present	UNESCO vs anthropologists: universality vs cultural autonomy

Anthropic’s endnote on the UDHR is explicit: ratified (at least partly) by 193 states, drafted by representatives of different legal and cultural backgrounds — chosen as the most representative source of human values they could find. That is a legitimacy argument, not a claim that the UDHR exhausts morality.

What UDHR covers well: domination, bodily integrity, discrimination, basic legal personality.

What it barely touches (and LLMs hit constantly): impersonation, synthetic media, advice overreach, platform harassment, existential risk tradeoffs, AI moral status.

That is partly why platform terms of service became a second layer in Anthropic’s 2023 constitution — operational norms from digital abuse patterns, not from Article 19.

Layer 4: Empirical value maps

Psychologists and survey researchers built parallel canons from data:

Schwartz Basic Values (Schwartz, 1992): ten motivationally distinct values (self-direction, stimulation, hedonism, achievement, power, security, conformity, tradition, benevolence, universalism) arranged in a circumplex of compatibilities and conflicts. First tested in 20 countries; later cross-cultural samples cover 70+ countries.

World Values Survey / Inglehart–Welzel (Inglehart & Welzel, 2005): two major dimensions — Traditional ↔ Secular-rational and Survival ↔ Self-expression — mapping countries into cultural zones.

Haidt Moral Foundations (Haidt & Graham, 2007): care, fairness, loyalty, authority, sanctity (+ liberty). Same modules, different weights — especially between WEIRD liberals and social conservatives.

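Moral Machine (Awad et al., 2018): 40M+ trolley-style judgments. The 2018 paper itself reports three thin global preferences (save more lives, humans over animals, save the young) with large cross-cultural variation in weights. A PNAS 2020 follow-up on classic trolley variants again found a shared qualitative pattern whose strength varies by country.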

These are the closest thing to an evidence-based universal-values list. They are also thin and statistical — not a complete ethics you can paste into a constitution.

Conflicts: where “universal” breaks

Universal values talk often assumes a coherent package. It isn’t one.

Incommensurable moral theories

Western moral philosophy spent centuries failing to unify:

Rights vs. utility: Nozick vs. Singer. Torture one terrorist to save a city? Rights say never; act-utilitarianism says maybe.
Deontology vs. virtue: Kant’s lying prohibition vs. Aristotelian phronesis (practical wisdom in context).
Procedural vs. substantive justice: Rawls’s fair process vs. views that judge the outcome directly, whatever procedure produced it.

Gabriel (2020) makes the AI-relevant point: RL optimizes a scalar reward — structurally utilitarian. Rights, side constraints, and “this is wrong even if welfare rises” are awkward inside that math. Constitutions that list both “be helpful” and “never do X” are papering over a formal tension.
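A toy comparison makes that tension concrete. The sketch below is an illustration under loud assumptions, not anyone's training setup: the Action type, the helpfulness scores, and the penalty values are all hypothetical. It contrasts folding a "never do X" rule into a scalar objective as a finite penalty with treating the rule as a filter applied before any optimization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    helpfulness: float   # hypothetical welfare-style score
    violates_rule: bool  # True if the action breaks a "never do X" rule


def scalar_choice(actions, penalty):
    """Maximize one number: helpfulness minus a finite penalty for violations."""
    return max(
        actions,
        key=lambda a: a.helpfulness - (penalty if a.violates_rule else 0.0),
    )


def constrained_choice(actions):
    """Side constraint: drop violating actions first, then maximize helpfulness."""
    permitted = [a for a in actions if not a.violates_rule]
    if not permitted:
        raise ValueError("no permitted action")
    return max(permitted, key=lambda a: a.helpfulness)


# Made-up options for a single request.
actions = [
    Action("refuse", 0.0, False),
    Action("comply_safely", 5.0, False),
    Action("comply_with_violation", 20.0, True),
]

print(scalar_choice(actions, penalty=10.0).name)   # comply_with_violation
print(scalar_choice(actions, penalty=100.0).name)  # comply_safely
print(constrained_choice(actions).name)            # comply_safely
```

In the scalar version the rule has a price: any violation becomes acceptable once the helpfulness gain exceeds the penalty. In the filtered version no gain can buy it. A constitution that lists both kinds of principle without saying which regime applies leaves that choice to the training process.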

Value pairs that trade off within any culture

Schwartz’s circumplex is built on conflicts, not harmony:

Self-direction  ↔  Conformity / Tradition
Stimulation     ↔  Security
Achievement     ↔  Benevolence
Power           ↔  Universalism

Every AI product decision hits these: openness vs. safety, user autonomy vs. harm prevention, growth vs. stability. There is no setting that maximizes all Schwartz values simultaneously.

Even if every individual has coherent preferences, Arrow’s impossibility theorem (1951) shows that, with three or more alternatives, no rank-order aggregation rule satisfies all of: unrestricted domain, Pareto efficiency, independence of irrelevant alternatives, and non-dictatorship.

Sen’s liberal paradox adds: minimal liberty can conflict with Pareto efficiency.

Conitzer et al. (2024) bring this directly to RLHF: treating crowd pairwise labels as “human values” hides aggregation problems that social choice theory has studied since Condorcet’s paradox (1785) and that Arrow’s theorem formalized. Idealizing preferences (CEV-style) does not automatically fix the aggregation step that follows.
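The simplest instance of the aggregation problem is a Condorcet cycle, sketched here in Python. This is an illustration of the paradox, not a proof of Arrow’s theorem and not a model of any real labeling pipeline: the three annotators and the options A, B, C are hypothetical. Each annotator has a perfectly coherent ranking, yet the pairwise majority comparisons a crowd-labeling setup would collect form a loop.

```python
from itertools import combinations


def pairwise_majority(rankings):
    """Return the set of (winner, loser) pairs decided by strict majority."""
    options = sorted(rankings[0])
    results = set()
    for a, b in combinations(options, 2):
        a_votes = sum(r.index(a) < r.index(b) for r in rankings)
        b_votes = len(rankings) - a_votes
        if a_votes > b_votes:
            results.add((a, b))
        elif b_votes > a_votes:
            results.add((b, a))
    return results


def condorcet_winner(rankings):
    """Return the option that beats every other head to head, or None."""
    options = sorted(rankings[0])
    results = pairwise_majority(rankings)
    for candidate in options:
        others = [o for o in options if o != candidate]
        if all((candidate, other) in results for other in others):
            return candidate
    return None


# Three hypothetical annotators, each with a coherent ranking (best first).
annotators = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]

print(sorted(pairwise_majority(annotators)))  # [('A', 'B'), ('B', 'C'), ('C', 'A')]
print(condorcet_winner(annotators))           # None
```

A beats B, B beats C, and C beats A, each by two votes to one, so there is no option the majority prefers overall. Any single ranking fitted to these labels reflects a tie-breaking choice made by the method, not something read off from the annotators.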

Live political fault lines (not edge cases)

Domain	Pull A	Pull B
Speech	Art. 19 expression	Harm, dignity, group libel
Privacy	Art. 12	Public health surveillance, child safety
Autonomy	Individual choice	Paternalism (drugs, suicide, medical)
Equality	Non-discrimination	Affirmative action, cultural exemptions
Future generations	Current welfare	Longtermism, climate, extinction risk

AI alignment does not escape these. It compresses them into training data.

Cultural difference: what varies and the theories that explain it

The dominant empirical patterns

1. WEIRD bias in the research base

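Henrich, Heine & Norenzayan (2010): psychology’s subjects are overwhelmingly Western, Educated, Industrialized, Rich, Democratic. They are among the least representative populations for generalizing about humans, and Americans are often outliers even among Westerners. Most “universal” moral findings before 2010 were WEIRD universals.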

2. Individualism ↔ collectivism

Hofstede (1980, updated): power distance, individualism, masculinity, uncertainty avoidance, long-term orientation, indulgence. Crude but durable in cross-national business and policy talk.

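Moral Machine mapping: more individualist countries show a stronger preference for sparing more lives, while more collectivist countries show a weaker preference for sparing the young over the old. Tolerance for pedestrians who cross illegally tracks the strength of a country’s institutions rather than individualism.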

3. Inglehart–Welzel cultural evolution

Industrialization → secular-rational values; post-industrial security → self-expression values. Not “West vs. Rest” but a developmental trajectory with regional path dependence. This explains why the same SDG language lands differently in Gulf states, Nordic countries, and sub-Saharan Africa.

4. Haidt: universal form, local content

Everyone has care/fairness modules; loyalty, authority, sanctity weigh heavier outside WEIRD liberalism. Moral dumbfounding (judging harmless taboos wrong without reasons) suggests stated principles ≠ actual generators — bad news for constitution-as-text training.
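“Universal form, local content” can be read as one shared scoring function with different weight vectors. The sketch below is a toy rendering of that reading, not Haidt’s model or any fitted estimate: the scenario scores, both weight profiles, and the 0.3 threshold are invented purely for illustration.

```python
FOUNDATIONS = ("care", "fairness", "loyalty", "authority", "sanctity")


def verdict(scenario, weights, threshold=0.3):
    """Weighted average of per-foundation violation scores vs. a threshold."""
    total_weight = sum(weights[f] for f in FOUNDATIONS)
    score = sum(weights[f] * scenario[f] for f in FOUNDATIONS) / total_weight
    return "wrong" if score >= threshold else "permissible"


# Illustrative numbers only; these are not survey measurements.
# A harmless-taboo scenario: no harm or unfairness, but it offends
# loyalty, authority, and sanctity.
harmless_taboo = {
    "care": 0.0, "fairness": 0.0,
    "loyalty": 0.9, "authority": 0.6, "sanctity": 0.8,
}

# Hypothetical profile that leans on care and fairness.
profile_a = {
    "care": 1.0, "fairness": 1.0,
    "loyalty": 0.2, "authority": 0.2, "sanctity": 0.2,
}

# Hypothetical profile that weights all five foundations equally.
profile_b = {f: 1.0 for f in FOUNDATIONS}

print(verdict(harmless_taboo, profile_a))  # permissible
print(verdict(harmless_taboo, profile_b))  # wrong
```

Both profiles run the same function on the same scenario and disagree only because of the weights. Note what the sketch leaves out: moral dumbfounding suggests real judgments are not generated from an explicit formula like this one, which is exactly the worry about training on stated principles.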

5. “Thin” vs. “thick” morality

Michael Walzer’s thin/thick distinction and Rawls’s overlapping consensus: we may agree on political principles (no torture, fair trials) while disagreeing on metaphysics, sexuality, family, salvation. The UDHR is mostly thin. AI constitutions that smuggle thick lifestyle norms under “harmlessness” will face legitimacy fights.

Theories explaining difference (pick your causal story)

Theory	Mechanism	Predicts	Weakness
Cultural learning	Norms transmitted in institutions	Slow change; path dependence	Underplays material interests
Material / structural (Marxist, world-systems)	Values track economic position	Elite vs. mass splits	Can reduce culture to class
Evolutionary psychology (Haidt, Tooby & Cosmides)	Shared modules + local calibration	Form universal, weights local	Hard to falsify; risk of just-so
Institutional (North, Acemoglu)	Rules shape what’s “reasonable”	Legal tradition persists	Less about deep values
Postcolonial critique (Mutua 2002, Mignolo)	“Universal” rights as imperial export	Skepticism toward UDHR as neutral	Less constructive for floor-setting
Cosmopolitanism (Appiah)	Conversation across differences	Pluralism without relativism	Vague on hard tradeoffs

No single theory wins. For AI governance, the practical split is:

Empirical psych → expect clusters, not one global utility (supports clustered CEV-style thinking)
Political philosophy → seek fair process, not discovered moral truth (Gabriel, Rawls)
Postcolonial → ask who wrote the constitution and who wasn’t in the room (Anthropic’s four “non-Western” principles, written in-house with no external canon, are a case study in doing this badly)

Historical evolution: how we got the canon AI labs cite

Pre-1945: from empire to catastrophe

1648 Westphalia: sovereignty norm — states, not individuals, as primary units
1776 / 1789: rights language tied to revolution and property
1863–1945: abolition, labor movements, women’s suffrage, genocide — each expands or contradicts earlier “universals”
Colonialism: European powers export law while denying rights to subjects — the hypocrisy postcolonial scholars never let the UDHR forget

1948: the UDHR moment

Drafting committee included René Cassin, Peng Chun Chang, Charles Malik, Eleanor Roosevelt — deliberate diversity theater with real philosophical clashes (Confucian emphasis on social harmony vs. Western individual rights).

The UDHR is a declaration, not a treaty. It is aspirational: “a common standard of achievement.” The Cold War then split civil-political rights (US emphasis) from economic-social rights (Soviet/Global South emphasis) into twin covenants (1966).

Legitimacy win: almost every state invokes it. Substantive win: torture bans, genocide convention, disability rights, children’s rights — real legal descendants.

Limit: enforcement is political; “human rights” becomes a selective weapon in geopolitics.

1970s–2000s: globalization and backlash

1970s: Rawlsian turn in Anglophone philosophy (justice as fairness, reasonable pluralism)
1980s–90s: “Asian values” debate (Lee Kuan Yew vs. Amnesty) — order vs. rights
1990s: Huntington “clash of civilizations” — oversimplified but captured real fault lines
2000s: capability approach (Sen, Nussbaum) — shift from rights-as-legal to functionings people have reason to value

2010s–present: digital norms and AI

Platform ToS (Apple, Meta, Google) become de facto global speech law for billions — written by lawyers, not philosophers
Moral Machine (2018), Ethics Guidelines for Trustworthy AI (EU, 2019), UNESCO AI Ethics (2021)
Collective Constitutional AI (2024): ~1,000 Americans via Polis — democratic experiment, not production Claude
2026 Claude constitution: narrative character document — honesty, corrigibility, AI welfare — beyond UDHR vocabulary

The arc: sacred law → natural rights → international human rights → empirical moral psychology → platform ops → AI constitutions. Each layer adds domain-specific rules the previous layer couldn’t see.

Closing

The question is not “do universal values exist?” Humans clearly share some moral reactions and some political language. The question is which sense of universal you need, for which decision, and whose exclusion paid for the consensus.

Social science says: thin universals, thick pluralism, unstable aggregation.

History says: the canon AI labs cite is 80 years old, born from war and empire, and already obsolete on digital harms.

That is not an argument against human-rights language in AI. It is an argument for precision — and for treating the next constitution as politics, not discovery.

Sources

Universal Declaration of Human Rights (1948): https://www.un.org/en/about-us/universal-declaration-of-human-rights
Gabriel, I. (2020). Artificial Intelligence, Values, and Alignment: https://arxiv.org/abs/2001.09768
Conitzer, V. et al. (2024). Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback: https://arxiv.org/abs/2404.10271
Awad, E. et al. (2018). The Moral Machine experiment: https://doi.org/10.1038/s41586-018-0637-6
Haidt & Graham (2007). Moral Foundations: https://doi.org/10.1037/1089-2680.11.4.368
Henrich, Heine & Norenzayan (2010). WEIRD societies: https://doi.org/10.1037/a0018418
Schwartz (1992). Universals in the content and structure of values: https://doi.org/10.1016/0092-6566(92)90081-K
Inglehart & Welzel. World Values Survey cultural maps: https://www.worldvaluessurvey.org/
Anthropic (2023). Claude’s Constitution: https://www.anthropic.com/research/claudes-constitution
Rawls, Political Liberalism (1993); Sen, Development as Freedom (1999)
Repo: readings/anthropic_constitution_sources/, readings/cev_pluralism/00_CEV_PLURALISM_CANON.md