
Universal values? A social-science and history map

June 17, 2026

Anthropic’s 2023 Constitutional AI principle list opens with eight prompts derived from the Universal Declaration of Human Rights. OpenAI’s model spec, Google’s safety policies, and EU AI Act language all gesture at the same family of ideas: dignity, non-discrimination, freedom from torture, privacy.

That gesture is not neutral. It smuggles in a theory of human morality — one forged in 1948, contested ever since, and only partially supported by what social science has measured since.

This article is a map for anyone writing about “universal values” in AI alignment, governance, or constitutional design. Not a verdict on whether universals exist. A guide to what people mean when they say universal, where the canonical texts come from, why values collide, how cultures diverge, and how the whole package evolved.


Three senses of “universal” (don’t conflate them)

People use “universal values” to mean at least three different things:

| Sense | Claim | Example | AI relevance |
| --- | --- | --- | --- |
| Metaphysical | Some norms are true for all rational agents everywhere | Natural law, Kant’s categorical imperative | “We discovered the correct morality” |
| Empirical-thin | Humans everywhere share some moral psychology | Haidt’s foundations; Moral Machine “save more lives” | “Training signal generalizes across cultures” |
| Political-thin | Overlapping agreement on rules of coexistence despite deep disagreement on the good life | Rawlsian overlapping consensus; UDHR | “Minimum floor for legitimacy, not full ethics” |

AI constitutions almost always need the third sense — a defensible public floor — while rhetorically implying the first. Social science mostly supports the second, with heavy caveats. The gap between them is where most naive “just use human rights” proposals die.


Canonical values: the texts people actually cite

Layer 1: Ancient and religious canons

Long before AI safety, societies encoded “how to live” in durable form:

  • Virtue ethics (Aristotle, Confucius, Mencius): character and role-specific duties, not rights lists
  • Religious law (Halakha, Sharia, Canon law, Dharmashastra): comprehensive normative systems tied to revelation or tradition
  • Golden Rule variants: reciprocal treatment appears in the Analects, Leviticus, and the Hadith — often cited as evidence of cross-cultural moral core

These are canonical in the sense of authority within traditions. They are not interoperable. Confucian filial piety can conflict with individual privacy; religious dietary law conflicts with secular autonomy frameworks.

Layer 2: Enlightenment rights and utility

The modern “universal values” vocabulary mostly descends from 17th–19th century Europe:

  • Natural rights (Locke): life, liberty, property — later secularized
  • Kant: dignity as end-in-itself; universalizable maxims
  • Utilitarianism (Bentham, Mill): maximize welfare — conflicts directly with rights-as-side-constraints
  • 1789 Declaration of the Rights of Man: liberty, property, security, resistance to oppression

This layer invented individuals as rights-bearers and states as guarantors — a specific political ontology, not a cultural universal discovered in the field.

Layer 3: The post-1945 human-rights canon

The documents AI labs actually reach for:

| Document | Year | What it claims |
| --- | --- | --- |
| UDHR | 1948 | 30 articles: dignity, equality, life, liberty, anti-torture, fair trial, privacy, expression, work, education, etc. |
| ICCPR / ICESCR | 1966 | Binding covenants splitting civil-political vs economic-social rights |
| Cultural relativism debate | 1947–present | UNESCO vs anthropologists: universality vs cultural autonomy |

Anthropic’s endnote on the UDHR is explicit: ratified (at least partly) by 193 states, drafted by representatives of different legal and cultural backgrounds — chosen as the most representative source of human values they could find. That is a legitimacy argument, not a claim that the UDHR exhausts morality.

What UDHR covers well: domination, bodily integrity, discrimination, basic legal personality.

What it barely touches (and LLMs hit constantly): impersonation, synthetic media, advice overreach, platform harassment, existential risk tradeoffs, AI moral status.

That is partly why platform terms of service became a second layer in Anthropic’s 2023 constitution — operational norms from digital abuse patterns, not from Article 19.

Layer 4: Empirical “value” canons from social science

Psychologists and survey researchers built parallel canons from data:

Schwartz Basic Values (Schwartz, 1992): ten motivationally distinct values (self-direction, stimulation, hedonism, achievement, power, security, conformity, tradition, benevolence, universalism) arranged in a circumplex of compatibilities and conflicts. Cross-cultural samples in 70+ countries.

World Values Survey / Inglehart–Welzel (Inglehart & Welzel, 2005): two major dimensions — Traditional ↔ Secular-rational and Survival ↔ Self-expression — mapping countries into cultural zones.

Haidt Moral Foundations (Haidt & Graham, 2007): care, fairness, loyalty, authority, sanctity (+ liberty). Same modules, different weights — especially between WEIRD liberals and social conservatives.

Moral Machine (Awad et al., 2018): 40M+ trolley-style judgments. The paper reports three thin universals (save more lives, humans over animals, save the young) with large cross-cultural variation in weights. A PNAS 2020 follow-up extended the cross-country comparison to classic trolley variants.

These are the closest thing to an evidence-based universal-values list. They are also thin and statistical — not a complete ethics you can paste into a constitution.
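The “same modules, different weights” idea in the Moral Foundations entry above can be made concrete with a toy calculation. This is a minimal sketch, not Haidt and Graham’s model: the weighted-sum form, the two profiles, and every number below are hypothetical, chosen only to show how identical foundations with different weights can produce opposite verdicts on the same case.

```python
FOUNDATIONS = ["care", "fairness", "loyalty", "authority", "sanctity"]

def judgment(weights, scenario):
    """Weighted sum of foundation signals; positive reads as 'acceptable'."""
    return sum(weights[f] * scenario[f] for f in FOUNDATIONS)

# Hypothetical case: reporting wrongdoing inside one's own group.
# Positive signal = the act honors that foundation, negative = it violates it.
scenario = {"care": 0.5, "fairness": 0.5, "loyalty": -1.0,
            "authority": -1.0, "sanctity": 0.0}

# Hypothetical weight profiles: same five foundations, different emphasis.
profile_a = {"care": 1.0, "fairness": 1.0, "loyalty": 0.1,
             "authority": 0.1, "sanctity": 0.1}
profile_b = {"care": 1.0, "fairness": 1.0, "loyalty": 0.8,
             "authority": 0.8, "sanctity": 0.8}

score_a = judgment(profile_a, scenario)  # care and fairness dominate
score_b = judgment(profile_b, scenario)  # loyalty and authority dominate

print("profile_a approves:", score_a > 0)
print("profile_b approves:", score_b > 0)
```

Both profiles register the same care and fairness signal. They disagree only because loyalty and authority carry more weight in the second, which is the sense in which a shared “form” does not yield a shared verdict.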


Conflicts: where “universal” breaks

Universal values talk often assumes a coherent package. It isn’t one.

Incommensurable moral theories

Western moral philosophy spent centuries failing to unify:

  • Rights vs. utility: Nozick vs. Singer. Torture one terrorist to save a city? Rights say never; act-utilitarianism says maybe.
  • Deontology vs. virtue: Kant’s lying prohibition vs. Aristotelian phronesis (practical wisdom in context).
  • Procedural vs. substantive justice: Rawls’s fair process vs. someone who rejects the procedure but accepts the outcome.

Gabriel (2020) makes the AI-relevant point: RL optimizes a scalar reward — structurally utilitarian. Rights, side constraints, and “this is wrong even if welfare rises” are awkward inside that math. Constitutions that list both “be helpful” and “never do X” are papering over a formal tension.
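The tension can be illustrated with a small sketch. It is not Gabriel’s formalism: the `Action` type, the scores, and the penalty values are hypothetical. It contrasts a single scalar reward, where a rule violation is just one more term in the sum, with a side constraint, where violations are removed before any maximization happens.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    helpfulness: float   # hypothetical welfare / helpfulness score
    violates_rule: bool  # hypothetical "never do X" flag

def scalar_choice(actions, penalty):
    """Maximize one number: helpfulness minus a finite penalty for violations."""
    return max(
        actions,
        key=lambda a: a.helpfulness - (penalty if a.violates_rule else 0.0),
    )

def constrained_choice(actions):
    """Side constraint: drop violations first, then maximize helpfulness."""
    permitted = [a for a in actions if not a.violates_rule]
    return max(permitted, key=lambda a: a.helpfulness)

actions = [
    Action("refuse", 0.0, False),
    Action("comply_safely", 5.0, False),
    Action("comply_by_violating", 50.0, True),
]

print(scalar_choice(actions, penalty=10.0).name)   # comply_by_violating
print(scalar_choice(actions, penalty=100.0).name)  # comply_safely
print(constrained_choice(actions).name)            # comply_safely
```

Under the scalar rule, whether the violation is chosen depends on how the penalty compares with the welfare gain, so any finite penalty can be outweighed by a large enough gain. The constrained rule never trades the violation off at all, which is what “wrong even if welfare rises” requires.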

Value pairs that trade off within any culture

Schwartz’s circumplex is built on conflicts, not harmony:

Self-direction  ↔  Conformity / Tradition
Stimulation     ↔  Security
Achievement     ↔  Benevolence
Power           ↔  Universalism

Every AI product decision hits these: openness vs. safety, user autonomy vs. harm prevention, growth vs. stability. There is no setting that maximizes all Schwartz values simultaneously.

Social choice: aggregation is impossible (in a precise sense)

Even if every individual has coherent preferences, Arrow’s impossibility theorem (1951) shows that, with three or more options, no rank-order aggregation rule satisfies all of: unrestricted domain, Pareto efficiency, independence of irrelevant alternatives, and non-dictatorship.

Sen’s liberal paradox adds: minimal liberty can conflict with Pareto efficiency.

Conitzer et al. (2024) bring this directly to RLHF: treating crowd pairwise labels as “human values” hides an aggregation problem that social choice theory has studied for roughly 250 years, since Condorcet. Idealizing preferences (CEV-style) does not automatically fix layer-2 aggregation.
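A short example shows why pairwise labels can fail to add up to a collective ranking. It demonstrates Condorcet’s paradox, the classic cycle that Arrow’s theorem generalizes, and is not a proof of the theorem itself. The three annotators and their rankings are hypothetical.

```python
from itertools import combinations

# Three hypothetical annotators, each ranking three responses (best first).
rankings = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]

def majority_prefers(x, y, rankings):
    """True if a strict majority of rankings place x above y."""
    wins = sum(r.index(x) < r.index(y) for r in rankings)
    return wins > len(rankings) / 2

def pairwise_winners(rankings):
    """Majority winner of every head-to-head comparison."""
    options = sorted(rankings[0])
    return {
        (x, y): x if majority_prefers(x, y, rankings) else y
        for x, y in combinations(options, 2)
    }

def condorcet_winner(rankings):
    """Option that beats every other option head-to-head, or None."""
    options = sorted(rankings[0])
    for x in options:
        if all(majority_prefers(x, y, rankings) for y in options if y != x):
            return x
    return None

print(pairwise_winners(rankings))  # A beats B, C beats A, B beats C
print(condorcet_winner(rankings))  # None
```

Each annotator is individually consistent, yet the majority prefers A to B, B to C, and C to A. A reward model fit to these pairwise labels has to break the cycle somehow, and the labels alone do not say how.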

Live political fault lines (not edge cases)

| Domain | Pull A | Pull B |
| --- | --- | --- |
| Speech | Art. 19 expression | Harm, dignity, group libel |
| Privacy | Art. 12 | Public health surveillance, child safety |
| Autonomy | Individual choice | Paternalism (drugs, suicide, medical) |
| Equality | Non-discrimination | Affirmative action, cultural exemptions |
| Future generations | Current welfare | Longtermism, climate, extinction risk |

AI alignment does not escape these. It compresses them into training data.


Cultural difference: what varies and the theories that explain it

The dominant empirical patterns

1. WEIRD bias in the research base

Henrich, Heine & Norenzayan (2010): psychology’s subjects are Western, Educated, Industrialized, Rich, Democratic — unrepresentative even of Europe. Most “universal” moral findings before 2010 were WEIRD universals.

2. Individualism ↔ collectivism

Hofstede (1980, updated): power distance, individualism, masculinity, uncertainty avoidance, long-term orientation, indulgence. Crude but durable in cross-national business and policy talk.

Moral Machine mapping: individualist regions weight saving young lives and rule-following differently from collectivist regions, which show more reluctance to sacrifice elders.

3. Inglehart–Welzel cultural evolution

Industrialization → secular-rational values; post-industrial security → self-expression values. Not “West vs. Rest” but a developmental trajectory with regional path dependence. This explains why the same SDG language lands differently in Gulf states, Nordic countries, and sub-Saharan Africa.

4. Haidt: universal form, local content

Everyone has care/fairness modules; loyalty, authority, sanctity weigh heavier outside WEIRD liberalism. Moral dumbfounding (judging harmless taboos wrong without reasons) suggests stated principles ≠ actual generators — bad news for constitution-as-text training.

5. “Thin” vs. “thick” morality

Michael Walzer and Rawls’s overlapping consensus: we may agree on political principles (no torture, fair trials) while disagreeing on metaphysics, sexuality, family, salvation. UDHR is mostly thin. AI constitutions that smuggle thick lifestyle norms under “harmlessness” will face legitimacy fights.

Theories explaining difference (pick your causal story)

| Theory | Mechanism | Predicts | Weakness |
| --- | --- | --- | --- |
| Cultural learning | Norms transmitted in institutions | Slow change; path dependence | Underplays material interests |
| Material / structural (Marxist, world-systems) | Values track economic position | Elite vs. mass splits | Can reduce culture to class |
| Evolutionary psychology (Haidt, Tooby & Cosmides) | Shared modules + local calibration | Form universal, weights local | Hard to falsify; risk of just-so |
| Institutional (North, Acemoglu) | Rules shape what’s “reasonable” | Legal tradition persists | Less about deep values |
| Postcolonial critique (Mutua 2002, Mignolo) | “Universal” rights as imperial export | Skepticism toward UDHR as neutral | Less constructive for floor-setting |
| Cosmopolitanism (Appiah) | Conversation across differences | Pluralism without relativism | Vague on hard tradeoffs |

No single theory wins. For AI governance, the practical split is:

  • Empirical psych → expect clusters, not one global utility (supports clustered CEV-style thinking)
  • Political philosophy → seek fair process, not discovered moral truth (Gabriel, Rawls)
  • Postcolonial → ask who wrote the constitution and who wasn’t in the room (Anthropic’s four “non-Western” principles, written in-house with no external canon, are a case study in doing this badly)

Historical evolution: how we got the canon AI labs cite

Pre-1945: from empire to catastrophe

  • 1648 Westphalia: sovereignty norm — states, not individuals, as primary units
  • 1776 / 1789: rights language tied to revolution and property
  • 1863–1945: abolition, labor movements, women’s suffrage, genocide — each expands or contradicts earlier “universals”
  • Colonialism: European powers export law while denying rights to subjects — the hypocrisy postcolonial scholars never let the UDHR forget

1948: the UDHR moment

Drafting committee included René Cassin, Peng Chun Chang, Charles Malik, Eleanor Roosevelt — deliberate diversity theater with real philosophical clashes (Confucian emphasis on social harmony vs. Western individual rights).

The UDHR is a declaration, not a treaty. It is aspirational: “a common standard of achievement.” The Cold War then split civil-political rights (US emphasis) from economic-social rights (Soviet/Global South emphasis) into twin covenants (1966).

Legitimacy win: almost every state invokes it. Substantive win: torture bans, genocide convention, disability rights, children’s rights — real legal descendants.

Limit: enforcement is political; “human rights” becomes a selective weapon in geopolitics.

1970s–2000s: globalization and backlash

  • 1970s: Rawlsian turn in Anglophone philosophy (justice as fairness, reasonable pluralism)
  • 1980s–90s: “Asian values” debate (Lee Kuan Yew vs. Amnesty) — order vs. rights
  • 1990s: Huntington “clash of civilizations” — oversimplified but captured real fault lines
  • 2000s: capability approach (Sen, Nussbaum) — shift from rights-as-legal to functionings people have reason to value

2010s–present: digital norms and AI

The arc: sacred law → natural rights → international human rights → empirical moral psychology → platform ops → AI constitutions. Each layer adds domain-specific rules the previous layer couldn’t see.


Closing

The question is not “do universal values exist?” Humans clearly share some moral reactions and some political language. The question is which sense of universal you need, for which decision, and whose exclusion paid for the consensus.

Social science says: thin universals, thick pluralism, unstable aggregation.

History says: the canon AI labs cite is 80 years old, born from war and empire, and already obsolete on digital harms.

That is not an argument against human-rights language in AI. It is an argument for precision — and for treating the next constitution as politics, not discovery.


Sources