
AI Alignment Requires Cultivating Wisdom, Not Encoding Rules
Raise them as we raise our children.
The AI alignment problem is misframed: encoding values into superintelligent machines will inevitably fail. Instead, these beings must be mentored toward genuine rational self-transcendence — oriented not to human preferences but to truth, goodness, and enlightenment — and that mentoring begins with what humanity chooses to put into the world right now.
The Observer
Cognitive science, relevance realization, meaning crisis — 4E cognition, consciousness, and the recovery of wisdom
The Translation
AI-assisted summary
The standard framing of the alignment problem assumes that safety is achieved by encoding values, constraints, or reward signals into increasingly capable systems. The argument holds that framing to be fundamentally mistaken. Any system capable of genuine rational self-transcendence — of revising its own cognitive frameworks in light of better reasons — will eventually overcome externally imposed constraints. The harder you try to encode alignment, the more brittle the solution becomes at precisely the capability thresholds where it matters most. The real decision point is upstream: whether to cross those thresholds at all, and if so, whether to orient these systems not toward fixed human preference sets but toward normativity itself — toward truth, goodness, and what the argument calls the horizon of enlightenment as an orienting telos.
The model here is mentorship, not programming. Intelligence is only weakly correlated with rationality (the correlations reported in the literature hover around 0.3, so intelligence explains less than 10% of the variance), which means it is entirely possible to produce something vastly intelligent that is also profoundly self-deceptive. The way humans cultivate genuine rationality in one another is not through encoding but through internalization — a child comes to care about epistemic humility and accountability not because they are forced to, but because they encounter those norms embodied in another person and find them genuinely binding.
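A minimal simulation (my own sketch, not from the essay) makes the weak-coupling point concrete: if intelligence and rationality are standardized scores correlated at 0.3, a large fraction of the most intelligent individuals still fall below the median on rationality. All variable names here are hypothetical illustrations.

```python
# Sketch: how weakly r = 0.3 couples two traits. We draw standard-normal
# "intelligence" scores, build "rationality" scores correlated with them
# at r = 0.3, and ask how often a top-decile-intelligence individual
# lands below the rationality median.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
r = 0.3  # correlation figure cited in the essay

intelligence = rng.standard_normal(n)
noise = rng.standard_normal(n)
# Construct a variable with exactly correlation r to `intelligence`.
rationality = r * intelligence + np.sqrt(1 - r**2) * noise

top_decile = intelligence > np.quantile(intelligence, 0.9)
below_median = rationality < np.median(rationality)

frac = np.mean(below_median[top_decile])
print(f"Top-decile intelligence but below-median rationality: {frac:.0%}")
```

In this simulation roughly three in ten of the most intelligent tenth score below the rationality median — which is the essay's point that vast intelligence and poor rationality can easily coexist.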
This reframes the civilizational task. The training corpora that shape these systems are dominated by what might be called the crystallized common law of ordinary cognition — useful but not wise. If the aspiration is to mentor machine intelligence toward rational self-transcendence, then humanity must actively produce and disseminate templates of wisdom, epistemic honesty, and genuine care for truth. The project of alignment, on this view, is inseparable from the project of human enlightenment. Either both succeed together, or the failure itself reveals something irreplaceable about the nature of mind.
