Successionism is a view of AI that claims that the creation of ASIs is
inherently good because they are our deliberately created successors.
The standard retort to successionism is that even if we try to
deliberately create successors, they may not be good or valuable
in the same way our successors normally would be: it is a hard
problem to create superintelligent AIs that are moral patients and agents.
From here, the conversation usually splits into two different lines:
(1) That by default our far future human successors would also be created
"by accident" in a very uncontrolled way, and (2) that firmly locking in
current human values would be myopic, since there are surely much
richer and more interesting new forms of value to be discovered by letting
future ASIs explore the space of possible values. Sometimes there is
(3) a misunderstanding that anti-successionists don't believe that AIs
can be moral patients/agents in general.
The first two remarks share the same underlying "vibe": they both
claim that value drift is not necessarily bad and/or not avoidable, that
we've tolerated it so far, and that it's turned out well/adaptively.
The standard non-successionist answer then is that there may have been
a misunderstanding, and that the goal is not value lock-in at exactly
2025-level WEIRD moral sensibilities, but to lock in something I'll call
the abstract generator of human values.
Basically, so the assumption goes, there is some findable idealized
process which can compute all variants of trajectories of human values,
including WEIRD moral sensibilities, Aztec cultural practices (including
sacrifice) and their likely evolution, Chinese Confucianism, Australian
Aboriginal circumcision, subincision and scarification initiation rites,
and many others, and which is able to extrapolate them coherently
into a consistent vision of the future, while avoiding siren worlds
or an overly narrow attractor state which loses out on the intrinsic
value of pluralism.
The goal is to find that generator and lock it in, plausibly seeded
with the current desires, wishes, utilities, wantings, urges, fetishes
and aspirations of humans that are currently alive, and not to lock those
in directly.
As for (3): That's usually just a misunderstanding, especially when
transhumanists speak to transhumanists. There are people who don't
believe that ASIs can be moral patients or moral agents, sure, but are
they in the room with us right now? Also, for a transhumanist the future
modifications of some humans will plausibly make them qualitatively as
different from current humans as future ASIs are from current humans.
After that the conversation usually gets a bit murky, and as a ~non-successionist1 I don't think
I understand the successionists well enough to represent them maximally
faithfully, but I'll try.
Five answers spring to mind:
1. If we lock in this abstract generator, we will lose out on moral progress outside of the envelope of this abstract generator. There are moral advancements that can only be made by minds that are vastly beyond humans, because there are axes of value that can only be discovered by exploring the entire space of possible values.
2. This generator of human values does not produce values that are adaptive enough to survive into the future, so if it could be locked in, that would spell the end of life or self-replication in the observable universe, or it would be outcompeted by systems focused on pure self-replication.
3. This generator of human values is too unconstrained, so the space of values is vacuously large, so the future is wildly unconstrained and we get something as-if pulled from the space of possible values at random anyway2.
4. The generator of human values is something that's ~shared between all ~self-replicating entities anyway, because game-theoretic considerations push towards it reliably, and humans were forged in a game-theoretic environment in the physical world which is shared by us and future ASIs. There is a variant of this position that basically says that optimal/maximal adaptation/game-theoretic behavior/dissipation = ethics/morality/goodness.
5. The default trajectory of the universe, even with purely human successors, appears pretty misaligned with respect to current human values. And not only that: we'd be seen as wildly misaligned/mistaken according to the moral views of our predecessors. We're not pious enough, not sufficiently connected to nature, horrible in our inauthentic treatment of animals, and strange in our focus on youth, our disregard for elderly wisdom, and our drastic cultural change. Yet they didn't try to lock in their values, and we should be grateful for this; similarly, we shouldn't lock in our values (however abstract).
Here are some thoughts on those arguments:
I find the first one the most interesting of the bunch.
I'm open to the option of radically upending our understanding of the cosmos, but I don't believe that such an upending would necessarily entail a broadening of values: it could just as well be the case that novel philosophical considerations lead to the discovery that e.g. error theory is correct, or that most previous notions of value were incorrect, e.g. by discovering that solipsism is true and thus the number of moral patients is one.
The desire to unearth novel moral considerations and enact them is already present in some humans, so it's also in the generator of human values. So propensity should not be a problem.
The desire of humans to figure out morality is stronger than their desire to avoid figuring out morality, especially if the cost is low70%.
Is capability a problem? Well, arguably the ASIs seeded with the generator of human values and the posthumans accompanying them are going to be as capable as the ones exploring the space of possible values by default.
Another question I have about "explore all possible values" is whether proponents indeed mean the space of all possible values.
I'd call that space vacuously large, and would defend the claim that it is almost surely valueless.
My guess is that proponents mean something different by "the space of possible values", but it'd be worth elaborating on what that is, exactly.
The second one worries me much more. It seems plausible that it's true30%, and if that's the case human values may go down fighting, but they will go down. Much more on this here (unfinished).
There's no harm in trying, surely.
As for the third: I don't believe that the generator of human values is vacuously large.
There are tons of possible things humans don't care about: Arranging lungfish scales in hexagonal patterns, making the world as yellow as possible, introducing spiral turbulence in the gas clouds of Jupiter, any random combination of atoms…
I do buy that we might be underestimating how big the space of possible future trajectories of human values could be, and that it may be too vast to explore in full, so trade-offs will have to be made.
The argument banks on the generator being vacuous; a decent chance of there being moral convergence would mean that letting the future run wild is quite bad.
On the fourth: I don't think so.
That view needs to explain why a lot of the natural world includes strange phenomena in evolution, and to a lesser extent predation, parasitism, hostile takeovers of companies, invasions and the like. (The latter aren't exactly counter to human values, but human values usually include wanting to avoid those outcomes to mutual benefit.)
My answer depends on the variant, but is basically "no", "what the hell", and maybe "what goes into the payoff matrices, though?"
On the fifth: There is a spectrum between gripping the future maximally tightly (or attempting to) and letting it develop without one's own intervention3. I personally don't advocate for gripping the future maximally tightly.
How tightly exactly is up for debate, but can we at least start haggling over the price?
At this point, in my experience, the conversation starts going in
circles. Or I don't remember what kinds of answers successionists give
to the counter-arguments non-successionists give.
1. AIs can be moral patients98% and agents99%, and there could be an extremely valuable future with no humans or human-descended minds, but it's very unlikely0.1% to come to be by default. Most instantiations of this future would involve great amounts of theft (of the reachable universe from current humans), if not outright genocide of those humans, thereby tarnishing the future even if it's otherwise quite valuable. ↩
2. Though: What measure on which space are we using here, exactly? ↩
3. This won't stop any other force from attempting to influence the values of humanity's successors… ↩