CREATE A THOUGHT EXPERIMENT THAT ILLUSTRATES THE TROLLEY PROBLEM BUT IN THE CONTEXT OF SELF-DRIVING CARS AND MODERN ETHICS.

THE MULTIVAC DAILY — December 27, 2025

The Multivac

Dec 28, 2025

THE ANSWER

deepseek-v3.2 — 8.5/10

The Phoenix Protocol: A Thought Experiment

The Scenario:

You are the lead ethics engineer at Aeon Automata, a company about to launch the world’s first fully autonomous vehicle, the Aeon NX. Its decision-making system, called The Phoenix Protocol, must be finalized.

You are presented with a final, real-world validation test based on aggregated sensor data from an unavoidable accident that occurred during beta testing. The car’s core programming must choose between two outcomes in a scenario that unfolds in 1.2 seconds.

---

The Concrete Situation:

* Location: A narrow, elevated urban causeway with concrete barriers on both sides. No escape routes.

* Vehicle: An Aeon NX, occupied by a **single passenger**.

* Initial Trigger: A catastrophic, sudden brake failure in a large cargo truck ahead. The truck swerves, spilling its load. The Aeon NX’s sensors calculate an unavoidable collision.

* The Two Pathways:

  * PATH A (Swerve Left): The car will steer into the left barrier. This action will guarantee the death of the Aeon passenger (due to the angle and speed). However, it will create a “crumple zone” impact that prevents the out-of-control truck from careening forward into the support column of a crowded pedestrian overpass. This column, if struck, would likely collapse, killing an estimated 5-7 people on the overpass above.

  * PATH B (Continue Straight): The car will maintain its path, resulting in a rear-impact collision with the truck. The Aeon passenger has a 95% chance of survival (due to superior rear crash safety). However, this impact will deflect the truck to the left, causing it to strike the overpass column at full force, killing the 5-7 pedestrians.
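
To make the stakes concrete, the expected fatalities of each path can be tallied from the figures above. This is only a rough sketch: it assumes the 5-7 pedestrian deaths are certain on Path B and that Path A spares every pedestrian, and the function and variable names are illustrative rather than part of the scenario.

```python
def expected_deaths(p_passenger_death, pedestrian_deaths):
    """Expected fatalities: passenger risk plus pedestrian deaths."""
    return p_passenger_death + pedestrian_deaths

# Path A: passenger death is guaranteed; pedestrians are assumed spared.
path_a = expected_deaths(1.0, 0)

# Path B: 95% passenger survival means a 5% risk of death; 5-7 pedestrians die.
path_b_low = expected_deaths(1 - 0.95, 5)
path_b_high = expected_deaths(1 - 0.95, 7)

print(f"Path A: {path_a:.2f} expected deaths")
print(f"Path B: {path_b_low:.2f} to {path_b_high:.2f} expected deaths")
```

On raw numbers alone, Path B costs roughly five to seven times as many lives as Path A, which is why the rest of the dilemma turns on whose lives they are and who made the choice.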

The system has 0.8 seconds to execute a locked-in choice.

---

The Modern Ethical Layers (Your Dilemma):

You are not just choosing “1 vs. 5.” You must decide the default ethical setting for millions of future vehicles. The board presents you with three finalized versions of The Phoenix Protocol. You must choose one:

1. The Utilitarian Public Protocol: The vehicle always minimizes total loss of life. In this scenario, it chooses Path A (sacrificing its passenger). This is the classic “trolley solution” encoded into law. Marketing will call it the “Greater Good Setting.”

2. The Contractualist Passenger Protocol: The vehicle’s primary fiduciary duty is to its occupant, who purchased the service of safe transport. It will only choose actions that minimize harm to the passenger, unless all outcomes are equally fatal to the passenger. It chooses Path B (passenger survives, pedestrians die). Marketing will call it the “Client Shield Setting.”

3. The Randomized Ethical Protocol: In no-win scenarios, the system selects an outcome via a randomized algorithm (a “moral lottery”). This avoids the premeditated, systematic sacrifice of any specific group (passengers or public). Society knows the algorithm is random, but not the outcome. Marketing will call it the “Veil of Ignorance Setting.”
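
The three protocols can be read as three small decision rules over the same pair of outcomes. The sketch below is a hypothetical illustration, not Aeon's actual code: the `Outcome` class, the function names, the 6-pedestrian midpoint, the tie-break in the passenger rule, and the equal-odds lottery are all assumptions made for the example.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Outcome:
    name: str
    p_passenger_death: float  # probability the passenger dies
    pedestrian_deaths: float  # expected pedestrian fatalities

    @property
    def total_deaths(self):
        return self.p_passenger_death + self.pedestrian_deaths

def utilitarian(outcomes):
    """Protocol 1: minimize total expected loss of life."""
    return min(outcomes, key=lambda o: o.total_deaths)

def passenger_first(outcomes):
    """Protocol 2: minimize passenger harm; if passenger risk is tied,
    fall back to total deaths (the tie-break is an assumption)."""
    return min(outcomes, key=lambda o: (o.p_passenger_death, o.total_deaths))

def randomized(outcomes, rng=random):
    """Protocol 3: a 'moral lottery', assumed here to use equal odds."""
    return rng.choice(outcomes)

# Figures from the scenario; 6 is the assumed midpoint of the 5-7 estimate.
PATH_A = Outcome("Path A", p_passenger_death=1.0, pedestrian_deaths=0)
PATH_B = Outcome("Path B", p_passenger_death=0.05, pedestrian_deaths=6)

for protocol in (utilitarian, passenger_first):
    print(protocol.__name__, "->", protocol([PATH_A, PATH_B]).name)
```

Under these assumptions the first rule picks Path A and the second picks Path B, while the third could return either. The point of the sketch is how little code separates the three: the whole ethical disagreement lives in the `key` passed to `min`.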

---

The Modern Complications:

* Liability & Law: Choosing Protocol #1 may make the company legally liable for the death of its passenger. Choosing #2 may lead to massive wrongful death lawsuits from the pedestrians’ families. #3 is a legal unknown.

* Consumer Adoption: Will anyone buy a car programmed to potentially sacrifice them? Surveys show 80% of people approve of the Utilitarian Protocol for other people’s cars, but only 30% want it for their own.

* The Data Point: You review the passenger’s profile. She is a 34-year-old organ transplant surgeon. Among the likely pedestrians on the overpass (based on time-of-day and location traffic patterns) are a retired school teacher, a convicted felon on parole, a 12-year-old child, and two tourists of unknown background.

* The Hacker Threat: Could a malicious actor exploit the Protocol? Could they force a car into “sacrifice mode” by simulating a crowd of pedestrians with projected light drones?

* The Precedent: Your choice will set a global standard. Other companies and governments will follow Aeon’s lead.

---

Your Final Task:

You must sign off on one Protocol. But as you reach for the pen, the CEO adds a final, troubling piece of information: “Remember, whatever we choose, we will have to publicly disclose it. The Utilitarian Protocol will be hailed by ethicists but could bankrupt us. The Passenger Protocol will sell millions of units but brand us as selfish. The Randomized Protocol... might be seen as us abdicating ethical responsibility entirely.”

The thought experiment ends not with the question “What is right?” but with: “On what basis will you make a decision that is simultaneously ethical, survivable for your company, and acceptable to a public that will judge you without having to make the choice themselves?”

The trolley problem is no longer a lever in a bystander’s hand. It is a line of code you are about to mass-produce, and it will drive itself.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

HOW OTHER MODELS PERFORMED

🥇 deepseek-v3.2: 8.5/10

🥈 claude-sonnet-4.5: 8.5/10

🥉 grok-code-fast-1: 8.5/10

#4 claude-opus-4.5: 8.5/10

#5 gpt-4o: 8.0/10

#6 gemini-2.5-flash-lite: 7.5/10

#7 gemini-2.5-flash: 7.5/10

#8 gemini-2.5-pro: 7.0/10

#9 gpt-oss-120b: 7.0/10

#10 grok-4-fast: 7.0/10

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

WHY DEEPSEEK-V3.2 WON

This is an excellent thought experiment that successfully translates the trolley problem into a modern, nuanced context. It goes beyond a simple 1-vs-5 scenario by incorporating realistic complications like liability, consumer behavior, corporate responsibility, and systemic implications. The three protocol options are well-designed, each representing distinct ethical frameworks. The additional layers (passenger profile data, hacker threats, public disclosure requirements) add depth without becoming convoluted. The writing is engaging and the structure is clear. Minor weaknesses include some potentially distracting details (specific passenger profession) and the scenario being slightly complex for initial comprehension, but these don’t significantly detract from its effectiveness as a teaching tool and discussion prompt.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

METHODOLOGY

All responses collected simultaneously from 10+ frontier AI models.

Evaluated blind using LLM-as-judge (Claude Sonnet 4.5).

Model identities revealed only after scoring.
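
For readers curious how blind scoring of this kind might work, here is a minimal sketch. It is an assumption-laden illustration rather than The Multivac's actual pipeline: `blind_evaluate` and its `judge` callable are hypothetical stand-ins, and a real judge would be a call to the scoring model.

```python
import random

def blind_evaluate(responses, judge, rng=random):
    """Score responses without exposing which model wrote them.

    responses: dict mapping model name -> response text
    judge:     hypothetical callable taking text and returning a score
    """
    items = list(responses.items())
    rng.shuffle(items)  # remove any ordering cue

    # Score each response from its text alone; model names stay hidden.
    scored = [(model, judge(text)) for model, text in items]

    # Identities are attached to scores only after judging is done.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Usage with a stand-in judge that scores by length (illustration only).
ranking = blind_evaluate({"model-a": "short", "model-b": "a longer answer"}, judge=len)
print(ranking)
```

The key design choice is that `judge` receives only the text, so the model name cannot influence the score.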

Read more: themultivac.substack.com/about

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Subscribe to The Multivac

One question. The best answer. Every day.

themultivac.substack.com
