"The AI speaks with the confidence of an expert even when it knows nothing. It has no capacity for intellectual humility because it has no concept of its own ignorance." (Nexus, Chapter 8)
There's a persistent belief that computers are objective: free from the biases, emotions, and errors that plague human judgment. If a machine makes a decision, it must be based on pure logic and data. This belief is dangerously wrong.
AI systems are fallible. They make mistakes. They have biases. And their mistakes can be harder to detect and correct than human errors, precisely because we don't expect them.
Training Data Bias: AI learns from historical data that reflects historical biases
Objective Misspecification: AI optimizes for the wrong thing
Distribution Shift: AI fails when the world changes from what it learned
Adversarial Manipulation: AI can be deliberately fooled
Emergent Behavior: Complex systems produce unexpected outcomes
AI systems learn from data. If that data reflects historical discrimination, and most historical data does, the AI will learn to discriminate. This is not a bug that clever engineers can easily fix; it's inherent in the learning process.
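To make the mechanism concrete, here is a toy sketch (my own illustration, not from the book, assuming NumPy and scikit-learn) in which a model trained on historically biased hiring decisions reproduces that bias. The dataset, the penalty of 0.8, and the feature names are all invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
skill = rng.normal(size=n)           # true qualification
group = rng.integers(0, 2, size=n)   # a protected attribute (0 or 1)

# Historical labels: past decision-makers penalised group 1 regardless of skill.
hired = (skill - 0.8 * group + rng.normal(scale=0.5, size=n)) > 0

X = np.column_stack([skill, group])
model = LogisticRegression().fit(X, hired)

# Two equally skilled applicants who differ only in group membership:
print(model.predict_proba([[0.5, 0], [0.5, 1]])[:, 1])
# The group-1 applicant gets a lower predicted probability of being hired,
# even though nothing in the code told the model to discriminate;
# it learned that pattern from the historical labels.
```

The learning algorithm is doing exactly what it was asked to do: fit the historical decisions. The bias arrives with the data, not with the code.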
COMPAS is an AI system used by US courts to predict recidivism, that is, whether a defendant will reoffend. Studies found it was roughly twice as likely to falsely label Black defendants as high-risk compared with white defendants.
The system wasn't programmed to be racist; it simply learned from historical data that reflected systemic racism in the criminal justice system.
Many AI systems, especially deep learning neural networks, are "black boxes." They produce outputs, but we can't fully explain how they arrived at those outputs. The system has learned patterns that aren't accessible to human inspection.
This creates accountability problems. When an AI denies your loan application, you might have a legal right to know why. But if even the AI's creators can't explain the decision, how can they tell you?
Often the most accurate AI systems are the least explainable. Simpler, more interpretable models may perform worse. This creates a dilemma:
Option A: Use the black box and get better predictions but lose accountability
Option B: Use the interpretable model and get worse predictions but maintain accountability
In many applications (medicine, criminal justice, finance), this is a genuine ethical trade-off.
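Here is a hedged sketch of what the trade-off looks like in practice (assuming scikit-learn and a synthetic dataset; which model wins on accuracy depends entirely on the data, and the point is only that one decision rule is legible while the other is not):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

interpretable = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
black_box = GradientBoostingClassifier().fit(X_tr, y_tr)

print("logistic accuracy:", interpretable.score(X_te, y_te))
print("boosting accuracy:", black_box.score(X_te, y_te))

# The linear model's reasoning is auditable: one readable weight per feature.
print("logistic coefficients:", interpretable.coef_[0].round(2))
# The ensemble is hundreds of trees; no single weight explains any one decision.
```

Nothing in the code resolves the dilemma; it only makes visible what you give up in each direction.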
Large language models (like GPT and its successors) have a peculiar failure mode: they "hallucinate." They generate confident, fluent, detailed responses that are completely false. They invent citations that don't exist, events that never happened, facts that aren't true.
The AI doesn't know it's wrong. It isn't lying; it's confabulating, producing plausible-sounding content without reference to truth.
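The toy generator below (my own illustration; real LLMs are transformers, not bigram tables, but the underlying point carries over) shows why this can happen: a generative text model samples whatever is statistically plausible given the preceding words, and nothing in that procedure ever checks a claim against reality.

```python
import random
from collections import defaultdict

corpus = ("the study found that the new treatment improved outcomes "
          "the study was published in a leading journal "
          "the treatment was approved after the trial").split()

# Bigram table: for each word, the words observed to follow it.
followers = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev].append(nxt)

def generate(start="the", length=12):
    words = [start]
    for _ in range(length):
        options = followers.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))  # plausible continuation; truth is never consulted
    return " ".join(words)

print(generate())  # fluent-sounding sentences stitched from patterns, not from facts
```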
Harari connects AI fallibility to his earlier argument about the dangers of infallibility claims. Throughout history, systems that claimed to be error-free, such as religious authorities and totalitarian ideologies, became dangerous precisely because they couldn't self-correct.
AI risks repeating this pattern. If we treat algorithms as objective arbiters of truth, we disable our capacity to question them. Their errors become invisible or, worse, become accepted as facts.
Studies show that when humans oversee AI systems, they tend to defer to the machine. Over time, human judgment atrophies. We stop questioning the algorithm because it's usually right, and then we miss the times it's wrong.
This is especially dangerous in high-stakes domains like aviation, medicine, and military applications.
AI systems can be deliberately fooled in ways humans would not be. Researchers have shown, for example, that tiny, carefully crafted perturbations to an image, invisible to a human observer, can make a classifier confidently mislabel it.
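For readers who want to see the mechanism, here is a minimal sketch of the Fast Gradient Sign Method, one standard way such adversarial examples are built (assuming PyTorch; `model`, `image`, and `label` are placeholders for a pretrained classifier and a normalised input batch you would supply):

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.01):
    """Return a copy of `image`, imperceptibly perturbed, that the model is likely to mislabel."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Nudge every pixel a tiny step in the direction that most increases the loss.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()
```

To a human observer the perturbed image looks identical to the original; to the model it can look like a different object entirely.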
As AI becomes more central to critical infrastructure, these vulnerabilities become security threats.
We're entering an era where AI is used to create deceptive content (deepfakes, generated text) and other AI is used to detect it. This arms race has no clear endpoint. The tools of deception and detection will evolve together, with uncertain outcomes for human truth-seeking.
Harari argues that healthy AI systems, like healthy human institutions, need built-in humility: mechanisms to acknowledge uncertainty, flag potential errors, and enable correction. This means: