<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Blog on Hanyuan Jiang</title><link>https://jianghanyuan.github.io/blog/</link><description>Recent content in Blog on Hanyuan Jiang</description><generator>Hugo -- 0.156.0</generator><language>en-us</language><lastBuildDate>Mon, 02 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://jianghanyuan.github.io/blog/index.xml" rel="self" type="application/rss+xml"/><item><title>Humanity’s Thousand-Year Alignment Experiment</title><link>https://jianghanyuan.github.io/blog/humanitys-thousand-year-alignment-experiment/</link><pubDate>Mon, 02 Mar 2026 00:00:00 +0000</pubDate><guid>https://jianghanyuan.github.io/blog/humanitys-thousand-year-alignment-experiment/</guid><description>&lt;p&gt;When we discuss the problem of AI alignment, we tend to view it as an unprecedented technological challenge. However, human society has been conducting an alignment experiment for thousands of years. The object of this experiment is not silicon-based intelligence, but carbon-based intelligence itself. We call it law.&lt;/p&gt;
&lt;p&gt;In fact, I think the core dilemmas of legal systems and of AI alignment are strikingly similar. How do you constrain the infinitely varied behavior of an agent with a finite set of rules? How do you keep the system predictable while still pursuing justice? How do you balance strict adherence to norms against the flexibility to respond to particular situations? More fundamentally, what does alignment mean when we can&amp;rsquo;t even agree on the &amp;ldquo;goal of alignment&amp;rdquo; itself? Over its long evolution from Hammurabi&amp;rsquo;s Code to modern constitutional law, this experiment has not produced a perfect solution, but I think the lessons it has accumulated do give one definitive answer to the problem of artificial intelligence alignment: it is a never-ending quest.&lt;/p&gt;</description></item><item><title>The Rating You See Is Pricing Tomorrow</title><link>https://jianghanyuan.github.io/blog/the-rating-you-see-is-pricing-tomorrow/</link><pubDate>Thu, 26 Feb 2026 00:00:00 +0000</pubDate><guid>https://jianghanyuan.github.io/blog/the-rating-you-see-is-pricing-tomorrow/</guid><description>&lt;p&gt;In competitive games like &lt;a href="https://en.wikipedia.org/wiki/League_of_Legends"&gt;&lt;em&gt;League of Legends&lt;/em&gt;&lt;/a&gt; or &lt;a href="https://en.wikipedia.org/wiki/Valorant"&gt;&lt;em&gt;Valorant&lt;/em&gt;&lt;/a&gt;, your visible rank updates after every match. The standard story says that number measures skill. But something stranger happens the moment the same number also determines who you face next, which queues you can enter, and whether your friends can still play with you.&lt;/p&gt;
&lt;p&gt;Imagine a new ranked arena with clean outcomes: two players, one match, one rating that rises when you win and falls when you lose. The official explanation sounds familiar: the number tracks your underlying strength. This is the good old &lt;a href="https://en.wikipedia.org/wiki/Elo_rating_system"&gt;Elo&lt;/a&gt; dream in modern UI, and recent theory gives that dream real substance by showing that Elo can be understood as a serious online learning rule under a Bradley-Terry model.&lt;sup&gt;1&lt;/sup&gt;&lt;span class="sidenote"&gt;&lt;sup&gt;1&lt;/sup&gt; Olesker-Taylor and Zanetti (&lt;a href="#ref-olesker2024"&gt;2024&lt;/a&gt;) analyze Elo through Markov chains and formalize when it tracks latent skill rather than just acting as competitive folklore.&lt;/span&gt; Yet the moment the rating also allocates your next opponents and your next set of options, it begins to do more than estimate what you are.&lt;/p&gt;
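&lt;p&gt;To make that claim concrete, here is a minimal sketch of the update rule, assuming the usual 400-point logistic scale and a fixed K-factor of 32; the function names are illustrative rather than taken from the paper. Read this way, each Elo step is, up to a constant step size, one move of online gradient ascent on the Bradley-Terry log-likelihood of a single match result.&lt;/p&gt;&lt;pre&gt;&lt;code class="language-python"&gt;def elo_expected(r_a, r_b):
    """Bradley-Terry win probability for player A on the base-10, 400-point Elo scale."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a, r_b, score_a, k=32.0):
    """One online step: shift each rating by k * (observed - expected).

    Up to a constant step size, this is gradient ascent on the Bradley-Terry
    log-likelihood of this single match. score_a is 1.0 for a win by player A,
    0.5 for a draw, and 0.0 for a loss.
    """
    delta = k * (score_a - elo_expected(r_a, r_b))
    return r_a + delta, r_b - delta


# Example: a 1500-rated player beats a 1600-rated player.
# elo_update(1500.0, 1600.0, 1.0) moves the winner up by roughly 20.5 points.
&lt;/code&gt;&lt;/pre&gt;</description></item><item><title>Research Interests</title><link>https://jianghanyuan.github.io/blog/research-interests/</link><pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate><guid>https://jianghanyuan.github.io/blog/research-interests/</guid><description>&lt;p&gt;My research focuses on machine learning, especially frontier deep learning and reinforcement learning architectures, as well as LLM alignment. I am also interested in game theory and mechanism design.&lt;/p&gt;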
&lt;p&gt;Drafting in progress&amp;hellip;&lt;/p&gt;</description></item><item><title>On Learning, Longing, and All the Ideas We Cannot Name</title><link>https://jianghanyuan.github.io/blog/on-learning-longing-and-all-the-ideas-we-cannot-name/</link><pubDate>Fri, 20 Feb 2026 00:00:00 +0000</pubDate><guid>https://jianghanyuan.github.io/blog/on-learning-longing-and-all-the-ideas-we-cannot-name/</guid><description>&lt;p&gt;There are nights when the world feels almost structured enough to reveal its secret. I lie awake thinking about the quiet impossibility at the center of learning. A child hears scattered fragments of language and somehow extracts the grammar of an entire tongue. A bird sees the stars rotating overhead and knows which direction to migrate. A mathematician stares at symbols until patterns crystallize that were always there but never visible. Structure appears where none was visibly given. Something in the mind finds what the world does not openly display.&lt;/p&gt;</description></item></channel></rss>