Successionism is a view of AI that claims that the creation of ASIs is
inherently good because they are our deliberately created successors.
The standard retort to successionism is that even if we try to
deliberately create successors, they may not be good or valuable
in the same way our successors normally would be: it is a hard
problem to create superintelligent AIs that are moral patients and agents.
From here, the conversation usually splits into two different lines:
(1) That by default our far future human successors would also be created
"by accident" in a very uncontrolled way, and (2) that firmly locking in
current human values would be myopic, since there are surely much
richer and more interesting new forms of value to be discovered by letting
future ASIs explore the space of possible values. Sometimes there is
(3) a misunderstanding that anti-successionists don't believe that AIs
can be moral patients/agents in general.
The first two remarks share the same underlying "vibe": they both
claim that value drift is not necessarily bad and/or not avoidable, that
we've tolerated it so far, and that it's turned out well/adaptively.
The standard non-successionist answer then is that there may have been
a misunderstanding, and that the goal is not value lock-in at exactly
2025-level WEIRD moral sensibilities, but to lock in something I'll call
the abstract generator of human values.
Basically, so the assumption goes, there is some findable idealized
process which can compute all variants of trajectories of human values,
including WEIRD moral sensibilities, Aztec cultural practices (including
sacrifice) and their likely evolution, Chinese Confucianism, Australian
Aboriginal circumcision, subincision and scarification initiation rites,
and many others, and which is able to extrapolate them coherently
into a consistent vision of the future, while avoiding siren worlds
or an overly narrow attractor state which loses out on the intrinsic
value of pluralism.
The goal is to find that generator and lock it in, plausibly seeded
with the current desires, wishes, utilities, wantings, urges, fetishes
and aspirations of humans that are currently alive, and not to lock those
in directly.
As for (3): That's usually just a misunderstanding, especially when
transhumanists speak to transhumanists. There are people who don't
believe that ASIs can be moral patients or moral agents, sure, but are
they in the room with us right now? Also, for a transhumanist the future
modifications of some humans will plausibly make them qualitatively as
different from current humans as future ASIs are from current humans.
After that the conversation usually gets a bit murky, and as a ~non-successionist1 I don't think
I understand the successionists well enough to represent them maximally
faithfully, but I'll try.
Five answers spring to mind:
1. If we lock in this abstract generator, we will lose out on moral progress outside of the envelope of this abstract generator. There are moral advancements that can only be made by minds that are vastly beyond humans, because there are axes of value that can only be discovered by exploring the entire space of possible values.
2. This generator of human values does not produce values that are adaptive enough to survive into the future, so if it could be locked in, that would spell the end of life or self-replication in the observable universe, or it would be outcompeted by systems focused on pure self-replication.
3. This generator of human values is too unconstrained, so the space of values is vacuously large, so the future is wildly unconstrained and we get something as-if pulled from the space of possible values at random anyway2.
4. The generator of human values is something that's ~shared between all ~self-replicating entities anyway, because game-theoretic considerations push towards it reliably, and humans were forged in a game-theoretic environment in the physical world which is shared by us and future ASIs. There is a variant of this position that basically says that optimal/maximal adaptation/game-theoretic behavior/dissipation = ethics/morality/goodness.
5. The default trajectory of the universe, even with purely human successors, appears pretty misaligned with respect to current human values. And not only that: we'd be seen as wildly misaligned/mistaken according to the moral views of our predecessors. We're not pious enough, not sufficiently connected to nature, horrible in our inauthentic treatment of animals, and strange in our focus on youth, our disregard for elderly wisdom, and our drastic cultural change. Yet they didn't try to lock in their values, and we should be grateful for this; similarly, we shouldn't lock in our values (however abstract).
Here are some thoughts on those arguments:
I find the first one the most interesting of the bunch.
I'm open to the option of radically upending our understanding of the cosmos, but I don't believe that such an upending would necessarily entail a broadening of values: it could just as well be the case that novel philosophical considerations lead to the discovery that e.g. error theory is correct, or that most previous notions of value were incorrect, e.g. by discovering that solipsism is true and thus the number of moral patients is one.
The desire to unearth novel moral considerations and enact them is already present in some humans, so it's also in the generator of human values. So propensity should not be a problem.
The desire of humans to figure out morality is stronger than their desire to avoid figuring out morality, especially if the cost is low70%.
Is capability a problem? Well, arguably the ASIs seeded with the generator of human values and the posthumans accompanying them are going to be as capable as the ones exploring the space of possible values by default.
Another question I have about "explore all possible values" is whether proponents indeed mean the space of all possible values.
I'd call that space vacuously large, and would defend the claim that it is almost surely valueless.
My guess is that proponents mean something different by "the space of possible values", but it'd be worth elaborating on what that is, exactly.
The second one worries me much more. It seems plausible that it's true30%, and if that's the case human values may go down fighting, but they will go down. Much more on this here (unfinished).
There's no harm in trying, surely.
As for the third: I don't believe that the generator of human values is vacuously large.
There are tons of possible things humans don't care about: Arranging lungfish scales in hexagonal patterns, making the world as yellow as possible, introducing spiral turbulence in the gas clouds of Jupiter, any random combination of atoms…
I do buy that we might be underestimating how big the space of possible future trajectories of human values could be, and that it may be too vast to explore in full, so trade-offs will have to be made.
The argument banks on the generator being vacuous; a decent chance of there being moral convergence would mean that letting the future run wild is quite bad.
On the fourth: I don't think so.
That view needs to explain why a lot of the natural world includes strange phenomena in evolution, and to a lesser extent predation, parasitism, hostile takeovers of companies, invasions and the like. (The latter aren't exactly counter to human values, but human values usually include wanting to avoid those outcomes to mutual benefit.)
My answer depends on the variant, but is basically "no", "what the hell", and maybe "what goes into the payoff matrices, though?"
On the fifth: There is a spectrum between gripping the future maximally tightly (or attempting to) and letting it develop without one's own intervention3. I personally don't advocate for gripping the future maximally tightly.
How tightly exactly is up for debate, but can we at least start haggling over the price?
At this point, in my experience, the conversation starts going in
circles. Or I don't remember what kinds of answers successionists give
to the counter-arguments non-successionists give.
1. AIs can be moral patients98% and agents99%, and there could be an extremely valuable future with no humans or human-descended minds, but it's very unlikely0.1% to come to be by default. Most instantiations of this future would involve great amounts of theft (of the reachable universe from current humans), if not outright genocide of those humans, thereby tarnishing the future even if it's otherwise quite valuable. ↩
2. Though: What measure on which space are we using here, exactly? ↩
3. This won't stop any other force from attempting to influence the values of humanity's successors… ↩