Research
We are building a multi-step pathway to safe advanced AI.
At the heart of our breakthrough research is the Scientist AI, a novel approach conceived by Yoshua Bengio that represents a distinctive safety-centered path towards ASI.
The Scientist AI is inspired by an ideal scientist: a mind that has internalized the laws of nature and uses them to make predictions, but without any predilection for how things unfold. It is a highly intelligent machine that uses probabilistic reasoning to understand the world, but with no hidden goals or preferences. Its predictions are transparent, auditable, and verifiable.
As we build towards safe advanced AI, we expect the Scientist AI to accelerate scientific breakthroughs, provide guardrails and oversight for agentic AI systems, and advance our understanding of the risks posed by AI and how to avoid them.
Scientific theories aspire to describe what is, rather than prescribe what ought to be. At LawZero, we take this idea as a design principle for safe artificial intelligence: that understanding, even of arbitrary depth and scope, can be disentangled from preference over how the world unfolds.
We distill the motivations and core components of the Scientist AI, a system that aspires to this ideal, into a non-technical overview. Agency, we argue, rests on three pillars, each a matter of degree: affordances, goal-directedness, and intelligence. By limiting the first two while pursuing the third, we aim to build a system that is highly intelligent yet incapable of holding or pursuing goals of its own. The Scientist AI comprises a generator held accountable by a neutral estimator, allowing for creative thought without compromising safety. Two key ingredients are (i) contextualization, a transformation of the training data that disentangles facts from statements about those facts (e.g., opinions), and (ii) consequence invariance, a property of the training process that excludes feedback about downstream outcomes.
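To make the division of labour between these components concrete, the short Python sketch below is purely illustrative: the class and function names (CandidateGenerator, NeutralEstimator, guardrail_check) and the threshold HARM_THRESHOLD are our own placeholders rather than part of the Scientist AI design, and the probability it returns is a stub, not a learned model. It only shows the shape of the idea: a generator proposes candidate hypotheses, a neutral estimator assigns them probabilities given the available context, and that same estimator can act as a guardrail by flagging an agent's proposed action when the estimated probability of harm is too high.

```python
# Illustrative toy sketch of a "generator + neutral estimator" loop.
# All names here (CandidateGenerator, NeutralEstimator, guardrail_check,
# HARM_THRESHOLD) are hypothetical placeholders, not LawZero's design.

from dataclasses import dataclass
from typing import List


@dataclass
class Statement:
    """A claim whose truth is scored, kept separate from who asserted it."""
    text: str


class CandidateGenerator:
    """Proposes candidate hypotheses or answers for a given question."""

    def propose(self, question: str, n: int = 3) -> List[Statement]:
        # A real system would use a learned model; here we stub it out.
        return [Statement(f"hypothesis {i} for: {question}") for i in range(n)]


class NeutralEstimator:
    """Assigns a probability that a statement is true, given context.

    It only predicts; it holds no goals and, as a stand-in for consequence
    invariance, receives no feedback about what happens after its
    predictions are used.
    """

    def probability(self, statement: Statement, context: str) -> float:
        # Placeholder value standing in for P(statement is true | context).
        return 0.5


HARM_THRESHOLD = 0.05  # hypothetical risk tolerance for the guardrail example


def guardrail_check(estimator: NeutralEstimator, action: str, context: str) -> bool:
    """Guardrail example: allow an agent's action only if the estimated
    probability that it causes harm stays below the threshold."""
    harm_claim = Statement(f"executing '{action}' causes serious harm")
    return estimator.probability(harm_claim, context) < HARM_THRESHOLD


if __name__ == "__main__":
    generator, estimator = CandidateGenerator(), NeutralEstimator()
    context = "observations collected so far"
    for hypothesis in generator.propose("why did the experiment fail?"):
        p = estimator.probability(hypothesis, context)
        print(f"{hypothesis.text}: estimated probability {p:.2f}")
    print("action allowed:", guardrail_check(estimator, "deploy the update", context))
```

In this toy framing, the generator never learns from the downstream consequences of its proposals and the estimator only outputs probabilities over statements, which is one way to picture how creative generation and neutral assessment could be kept separate.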
We believe this approach offers a promising path toward systems that are at once powerful, transparent, and safe, and that may serve as trustworthy anchors in a world of increasingly autonomous AI.