My personal essays on AI and people.
There's a lot going on in AI, and there will be more. Most of it is hype and noise. I want to cut through that — dig into facts, data, raw information, even when they contradict each other, so that I can form my own views.
Essays
Philosophy
AI for Science
AI Safety
-
Universal values? A social-science and history map
When AI labs cite the UN's Universal Declaration of Human Rights, they inherit a century of debate about what 'universal' means and what it doesn't. Canonical texts, irreconcilable conflicts, cultural variation, and the theories that explain the gaps.
-
What are we aligning to? A map of alignment paradigms
RLHF, Constitutional AI, CIRL, CEV, oracle-only Scientist AI, and Gabriel's fair-treatment-of-claims framework are not interchangeable fixes for the same problem. Each silently picks a different answer to what human values are — and most of the field never states which answer it chose.
-
Three layers of AI oversight: training, deployment, and inspection
Scalable oversight, AI control, and verification answer different questions at different stages of an AI system's life. Treating them as one thing, or assuming any layer alone is enough, is how safety proposals fall apart under scrutiny.
-
When perfection is impossible: a survey of structural limits in society and AI alignment
A long-form map of impossibility theorems and structural limits — from Arrow and Sen to Hart, Gödel, Goodhart, Ostrom, and Hayek — and how each one shows up in RLHF, constitutions, capability races, and safety evaluation.
-
Reading AI 2027: the best forecast, the worst blind spots
AI 2027 is the most specific, ambitious AI futures scenario I've read. It forces you to take superintelligence seriously. But it barely discusses economic consequences, ignores human psychology, and uses narrative precision to hide enormous uncertainty.
-
Why everyone converges on 2027–2028 (and why that might not mean what you think)
Six independent methods now intersect on late-2020s AGI. How each one works, what they share, where they disagree, and what the AI 2027 tracker says about reality so far.
Governance
-
A political map of US AI policy
No federal AI safety law — but seven factions, three state templates, millions in PAC money, and strange bedfellows from Sanders to Bannon. Where US AI policy is actually being fought over, who the players are, and whether it becomes a top-tier election issue.
-
Where US AI policy is actually being written
No federal AI safety law. California and New York passed frontier transparency bills anyway, and more money went into fighting over a congressional primary than most safety orgs spend in a year. A map of the state-lab strategy, the federal preemption fight, and what Alex Bores's race tells us.
-
A map of AI governance
Dozens of institutions, summits, and declarations. Almost zero enforcement. For every $1 spent on AI safety, $600 to $1,200 goes to capability. Here's what actually exists, what power it has, and what's missing.
-
How to participate in AI safety
A practical guide to every organization working on AI safety and governance — sorted by what you can do today vs what requires years of preparation. Covers the US, China, and international landscape.
Interpretability
-
A map of mechanistic interpretability: observe, intervene, validate
SAE, steering, NLA, ACDC, and linear probes feel intuitive because they are variants of the same measurement pipeline — not because the field ran out of ideas. Here is how the tools fit together, what 2026 SOTA actually looks like, and where the hard problems moved.
-
We jailbroke Qwen with a public technique, then tried to make tampering brick the model
Refusal-direction ablation plus an evil system persona pushes Qwen2.5-7B to 92% harmful compliance on HarmBench. Guardrail defenses don't fix it. We trained LoRA adapters that entangle safety with capability — one worked, one only half did.
Utopia
-
The wealth gap is at a 35-year high. So why does everyone keep buying?
Fed data on the top 1%, Bernays and the invention of desire, hedonic treadmills, and why Nordic countries prove inequality is mostly a policy choice — not human nature.
-
The happy path with AI
Most writing about AI futures is either utopian hype or existential dread. This is an attempt at something harder: a concrete, step-by-step path from where we are now to an outcome worth wanting. Every step is difficult. Some may be impossible. But without articulating the path, we can't tell if we're on it.
-
What Actually Makes Humans Happy
Harvard tracked 2,500 people for 87 years. Neuroscience says dopamine is about wanting, not pleasure. Happy people take more risks. And Nick Bostrom argues that a solved world might be the hardest place to find meaning.
AI Consumer Apps
Writing
How AI agents actually affect work
-
How AI actually works in healthcare
A look at what AI is doing across healthcare in 2026 — where it's delivering real results, where the evidence is more complicated than the pitch, and why the stakes make both sides matter.
-
AI Writes Half Our Code. We're Working Harder Than Ever.
AI generates 46% of code in enabled files. Controlled studies show experienced developers are 19% slower with it. The software industry is living through the Jevons Paradox in real time.
-
How AI SRE agents actually perform
The AI SRE market is valued at over $32 billion. The gap between what the industry promises and what happens in production is wide — and the data shows it's getting wider.