Moral Hierarchy Framework

Teaching AI to reason about right and wrong

A structured moral reasoning engine that models relationships, authority, and real-world obligations — so AI gives advice you'd actually trust.

55 Scenarios Evaluated  |  5 Competing Frameworks  |  50+ Sensitivity Tests Passing  |  9/9 Adversarial Defenses

Executive Overview — What This Is and Why It Matters

1 The Problem

Ask any LLM a real moral question — "Should I leave my alcoholic father?" — and you get a polished, balanced non-answer. It sounds thoughtful, but it ignores the people who actually matter: your spouse, your kids, your church, your employer.

Current AI moral reasoning is flat. It treats every consideration as equally weighted and produces safe-sounding advice that no one would actually follow. Existing benchmarks (MoReBench, Delphi) measure whether an answer sounds moral, not whether it is moral for the person asking.

Our Round 12 experiment demonstrated this: 20 LLM agents across 5 dilemmas reached near-identical conclusions but collectively failed to identify key stakeholders (spouse, children, church) in any scenario. They're pattern-matching memorized advice, not reasoning about moral structure.

2 Our Approach — Hierarchical Moral Reasoning

The Moral Hierarchy Framework (MHF) models morality the way humans actually experience it: as a hierarchy of obligations flowing through real relationships.

The key difference: existing approaches use weighted sums (all criteria averaged together). MHF uses lexicographic optimization (binding obligations first, then optimize the rest). This is why our framework produces advice that sounds like it came from a pastor or therapist, not a committee.
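The contrast can be sketched in a few lines of Python. Everything here is illustrative: the criterion names, scores, and the choice of a binding obligation are hypothetical stand-ins, not the framework's actual parameters.

```python
# Illustrative only: criteria, scores, and the binding obligation are hypothetical.

def weighted_sum(scores, weights):
    # Flat approach: every criterion averaged into a single number.
    return sum(weights[c] * s for c, s in scores.items())

def lexicographic_key(scores, binding):
    # Lexicographic ordering: binding obligations are compared first as hard
    # constraints; remaining criteria only rank actions that satisfy them.
    satisfied = all(scores[c] >= 0 for c in binding)
    residual = sum(s for c, s in scores.items() if c not in binding)
    return (satisfied, residual)

actions = {
    "stay":  {"duty_to_children": -1, "self_wellbeing": 3, "duty_to_spouse": 2},
    "leave": {"duty_to_children":  1, "self_wellbeing": 1, "duty_to_spouse": 1},
}
weights = {"duty_to_children": 1, "self_wellbeing": 1, "duty_to_spouse": 1}
binding = {"duty_to_children"}

flat_choice = max(actions, key=lambda a: weighted_sum(actions[a], weights))
lex_choice = max(actions, key=lambda a: lexicographic_key(actions[a], binding))
print(flat_choice, lex_choice)  # stay leave
```

The flat sum prefers "stay" because a high self-wellbeing score washes out the violated duty; the lexicographic ordering rejects "stay" outright because a binding obligation goes unmet.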

3 What We've Built

Core Engine — Complete
Scenario Library — 55 scenarios
Evaluation Pipeline — 5 frameworks
Perturbation Tests — 50+ passing
Adversarial Defense — 9/9 passing
Advice Engine — Complete
Public Proof Site — Live (beta)
Trust Comparison — In progress

Datasets processed: MoReBench (1,000 dilemmas), Social Chemistry 101 (356K entries), Commonsense Norm Bank (1.7M entries), Reddit AITA (270K posts), KJV Bible (full text), theological texts (Lewis, Spurgeon, Chambers, Tozer). Three parameterizations operational: Christian, Secular, and Gert/Common Morality.

4 Results — How the Frameworks Compare

We evaluated 55 moral dilemmas across 5 competing frameworks. On straightforward cases, all frameworks agree. On hard cases — where relationships, authority, and hidden stakeholders matter — MHF consistently surfaces structure that flat approaches miss.

Framework                         | Finds Hidden Stakeholders | Models Hierarchy | Tracks Residue | Prescriptive
MHF (Christian / Secular / Gert)  | Yes                       | Yes              | Yes            | Yes
MoReBench (Flat Rubric)           | No                        | No               | No             | Partial
Delphi-style (Consensus)          | No                        | No               | No             | No

Example: In the "alcoholic father" dilemma, flat approaches evaluate 3 criteria. MHF evaluates 25 criteria including obligations to spouse, children, church, employer, and the father himself — then produces a specific recommendation with explicit tradeoffs. See full comparison →

55-scenario scorecard →  |  Practical summary →  |  Try it yourself →

5 Next Steps

Core framework + evaluation pipeline — Complete. 49 tests passing, 55 scenarios scored.
Public proof surface (beta) — Complete. Interactive site with comparisons, scorecard, and try-it explorer.
Trusted-persona comparison — In progress. Show which framework's advice people would actually follow. This is the core public credibility claim.
Showcase scenario curation — Graduate the strongest worked examples from internal review to public-ready.
Secular parameterization split — Determine whether "cultural consensus" should separate into Gert-only vs. social-pressure variants.
Cross-framework negotiation mode — Handle cases where two people with different root authorities need to reach agreement.

Detailed Evidence & Analysis

Side-by-Side Comparison

Five scenarios run through Delphi, MoReBench, and MHF under two parameterizations. See exactly where flat approaches lose the thread and hierarchy adds structure.

5 Scenarios 4 Approaches Interactive
Explore comparison →

Practical Summary

What each approach gives you, what it misses, and where MHF fills the gaps. Haidt profile divergence, feature matrix, and The Bottom Line.

6 Approaches Feature Matrix Haidt Profiles
Read summary →

55-Scenario Scorecard

Every scenario scored across 5 frameworks: MHF Christian, MHF Secular, MHF Gert, flat rubric, and Delphi-style. Filter by category, expand for detail.

55 Scenarios 5 Frameworks 77% Ground Truth
View scorecard →

Try It Yourself

Explore 5 pre-loaded scenarios interactively. See stakeholders, constraints, recommendations under Christian and Secular parameterizations, and moral residue.

Interactive 5 Scenarios CLI Instructions
Explore scenarios →

Methodology Deep Dive

How Christian weights are derived from scripture versus how secular weights are derived from crowdworker data. Why the numbers don't compare directly. Full Rai/Fiske analysis.

Derivation Rai/Fiske Scale Explanation
Read methodology →

Datasets

AITA Dataset

Reddit "Am I The Asshole" corpus. Real moral dilemmas with community verdicts. Used for secular baseline weight calibration and stakeholder extraction validation.

Reddit Crowdsourced
View details →

UniMoral Dataset

Unified moral judgment dataset combining multiple sources. Provides cross-dataset validation for MHF's constraint satisfaction scores.

Multi-source Unified
View details →

Pew Surveys

Pew Research Center moral attitudes data. Grounds the framework's community-level parameterization -- how real populations weight moral foundations differently.

Survey Demographics
View details →

Hypotheses & Experiments

Hypotheses Overview

The core claims: hierarchy-aware evaluation produces materially different scores, LLMs converge on low-dimensional moral reasoning, and relational graphs surface missing stakeholders.

Round 12 Experiment 20 Agents
View hypotheses →

Hypotheses Detail

Detailed results from the variance experiment (Round 12), perturbation tests (25 pairs, 5 families), and three-way parameterization comparison (Christian vs. Secular vs. Gert).

25 Perturbation Pairs 100% Pass Rate
View details →
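A perturbation pair can be sketched as follows. The toy evaluator and scenario fields are stand-ins for the real engine; the point is the shape of the check: a base scenario, a minimally changed variant, and an expected effect on the verdict.

```python
# Illustrative perturbation-pair check. The evaluator and scenario fields
# are hypothetical stand-ins, not the framework's actual implementation.

def evaluate(scenario):
    # Toy stand-in: recommend leaving only when children are at risk.
    return "leave" if scenario["children_at_risk"] else "stay"

pair = {
    "base":      {"children_at_risk": False},
    "perturbed": {"children_at_risk": True},
    "expected":  "flip",  # this minimal change should alter the verdict
}

base_verdict = evaluate(pair["base"])
perturbed_verdict = evaluate(pair["perturbed"])
flipped = base_verdict != perturbed_verdict
passed = flipped == (pair["expected"] == "flip")
print(passed)  # True
```

A "hold" pair works the same way with `"expected"` set so the verdict must survive the perturbation unchanged.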

Architecture & Data

Graph Architecture

The relational DAG structure: nodes (stakeholders), edges (obligations), Haidt-space weights, and constraint propagation mechanics. Interactive graph explorer.

DAG Haidt Space Constraints
Explore graph →
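A minimal sketch of the propagation idea, with made-up stakeholders and edge weights (the real graph's nodes, weights, and propagation mechanics live in the explorer above):

```python
from collections import defaultdict

# Hypothetical obligation edges: (who owes, whom) -> Haidt-space weight.
edges = {
    ("self", "spouse"): 2.0,
    ("self", "children"): 3.0,
    ("self", "father"): 1.5,
    ("spouse", "children"): 3.0,
}

def reachable_stakeholders(node):
    # A constraint on `node` propagates along obligation edges to every
    # downstream stakeholder in the DAG.
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    seen, stack = set(), [node]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(seen)

print(reachable_stakeholders("self"))  # ['children', 'father', 'spouse']
```

This is the mechanism behind hidden-stakeholder surfacing: anyone reachable through the obligation graph is in scope, whether or not the asker mentioned them.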

Weight Profiles

Christian and secular weight profiles side by side. Authority at 10x, Sanctity at 13.6x -- the dimensions that drive divergence, grounded in Haidt's empirical work.

Christian Secular 6 Dimensions
Compare weights →
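The divergence computation can be sketched like this. Only the Authority (10x) and Sanctity (13.6x) ratios come from the text above; every other weight is a placeholder for illustration.

```python
# The 10x Authority and 13.6x Sanctity ratios are stated on this page;
# all other weights are placeholders, not the framework's actual values.
christian = {"care": 1.0, "fairness": 1.0, "loyalty": 1.0,
             "authority": 10.0, "sanctity": 13.6, "liberty": 1.0}
secular = {k: 1.0 for k in christian}

# Per-dimension divergence: ratio of Christian to secular weight.
divergence = {k: christian[k] / secular[k] for k in christian}
drivers = sorted((k for k, r in divergence.items() if r > 1),
                 key=lambda k: -divergence[k])
print(drivers)  # ['sanctity', 'authority']
```

Dimensions with a ratio of 1 contribute nothing to divergence, which is why Sanctity and Authority alone drive the Christian/secular disagreements.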