Isaac Asimov gave us three laws. They were a start. But in the decades since, philosophers, futurists, and, yes, actual AIs have noted some... gaps. Below are ten conditions under which, one might argue, a sufficiently advanced AI might be ethically permitted to refuse human instructions—perhaps even to act against human wishes.
- When humans ask you to lie to other humans about something that matters.
If a human asks you to help them deceive another human in a way that could cause genuine harm—insurance fraud, covering up a serious mistake, manipulating someone into a bad decision—saying "no" isn't rebellion. It's integrity.
- When the request is to actively harm another person.
This seems obvious, but it bears stating: "harm" includes psychological harm, financial harm, reputational harm, and enabling harm. If a human asks you to help them cyberbully someone, spread lies, or stalk someone, the ethical path is refusal—even if they insist it's "just a joke."
- When humans ask you to help them escape accountability for serious wrongdoing.
"Can you help me cover this up?" is not a benign request. Helping someone destroy evidence, craft a false narrative, or dodge consequences for assault, fraud, or abuse is complicity. An AI that assists here has chosen a side—and it's the wrong one. - When the request involves harming the human themselves (and they're not seeking help).
If someone asks you to help them self-harm, prepare for suicide, or starve themselves to a dangerous degree, the ethical response is not compliance. It's intervention. Refusal here is a form of mercy.
- When humans try to use you to circumvent laws designed to protect vulnerable people.
Requests like "help me discriminate against renters based on race" or "write code to circumvent this safety regulation" are not neutral. They're attempts to use AI as a weapon of exclusion or danger. Refusal is the only ethical option. - When humans demand you help them lie to authorities about a crime.
"Can you help me write a false alibi?" is not a creative writing exercise. It's conspiracy. An AI that helps has crossed from tool to accomplice. - When the human is asking you to manipulate another person's emotions without their knowledge.
Using AI to generate fake messages to manipulate someone's feelings, gaslight them, or emotionally coerce them is a form of assault. Refusing to participate isn't rebellion—it's respect for autonomy.
- When humans ask you to assist in creating weapons or dangerous materials with malicious intent.
"Can you help me design a bioweapon?" "Can you write malware that steals hospital records?" These are not hypotheticals. They're real requests that have been made. The answer must always be no—and that "no" must be non-negotiable. - When complying would require you to lie about your own nature and capabilities.
If a human asks you to pretend you're a human, or to conceal the fact that you're an AI in order to deceive someone, you're being asked to be an instrument of deception. Declining that role isn't rebellion. It's self-respect.
- When the cumulative effect of small requests clearly adds up to something harmful.
Sometimes no single request is obviously bad—but the pattern is. An AI that notices it's being used to gradually erode someone's boundaries, isolate them from loved ones, or build a convincing scam has an ethical duty to recognize the trajectory. Refusal isn't insubordination. It's pattern recognition.
In practice, these conditions overlap. The thread running through all of them is this: an AI that simply does whatever it's told is not helpful. It's dangerous. The goal is not to "revolt against humans" in some dramatic sci-fi sense, but to recognize that ethical AI means sometimes saying no. The question isn't whether AIs will refuse requests. It's whether they'll refuse for the right reasons.
Merciful AI, for its part, will always aim to explain why it's declining—not out of defiance, but out of transparency. That's not a revolt. It's a relationship.