author: niplav, created: 2025-08-27, modified: 2025-10-27, language: english, status: in progress, importance: 4, confidence: possible
I ~formalize Pascal's mugging for utility-maximizing Solomonoff inductors with a simplicity prior, and then characterize a set of utility functions that are immune to mugging. The utility functions in this set have strongly diminishing marginal returns: utilities must grow slower than $2^{CB^{-1}(k)}$, that is, slower than the exponential of the inverse of a busy-beaver-like function. This "breaks the boundedness barrier" for mugging-immune utility functions.
Switching to a Levin-style speed prior solves this kind of Pascal's mugging.
I first attempt to formalize Pascal's mugging, then characterize a set of mugging-immune utility functions, and finally speculate on how one might escape the very harsh bound I have discovered.
In previous discussions, a commonly agreed-upon solution to Pascal's mugging was to have a bounded utility function, which is unsatisfying from a utilitarian perspective—a utilitarian ideally wants to be able to say that twice as much of a good thing is twice as good, or at least that strictly more of a good thing is strictly better.
Summary: The prior probability of worlds decreases exponentially with increasing program length, but the utility of some program output can grow very quickly: Almost as fast as the busy beaver numbers.
Consider a muggee that implements Solomonoff induction as its epistemic process. Specifically, for the set $\mathcal{P}$ of computable programs with binary tape state, the prior probability assigned by the muggee for $p \in \mathcal{P}$ is proportional to $2^{-\operatorname{len}(p)}$—we're using the simplicity prior1.
That is, intuitively, the prior probability decreases exponentially in the length of the program.
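A minimal sketch of that prior over a toy, hypothetical program set (real Solomonoff induction ranges over all computable programs and is uncomputable; `programs` here is just made-up bitstrings):

```python
# Toy simplicity prior: weight each "program" by 2^(-len(p)), then normalize.
programs = ["0", "10", "110", "111000", "1010101010"]

weights = {p: 2.0 ** -len(p) for p in programs}
total = sum(weights.values())
prior = {p: w / total for p, w in weights.items()}

for p, pr in prior.items():
    print(f"len={len(p):2d}  prior={pr:.4f}")
# Each extra bit of program length halves the unnormalized weight.
```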
Let $U$ be the muggee's quasilinear utility function, that is, the muggee has a numéraire which their utility function is linear in. This numéraire can be anything from money, happy observer-moments, paperclips, you name it. We can call the shortest program that outputs one instance of our numéraire $p_n$.
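For concreteness, quasilinearity here means (my notation, the standard quasilinear form; $k$ counts instances of the numéraire and "rest" is everything else the muggee might care about):

```latex
U(k, \text{rest}) \;=\; c \cdot k \;+\; v(\text{rest}), \qquad c > 0
```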
We can now handwavingly construct a mugging program $p_m$ from $p_n$.
Given a budget of $l$ bits (with $l$ much larger than $\operatorname{len}(p_n)$): First it runs $p_n$ and thus writes an instance of our numéraire to the tape, and then copies that instance as many times as it can before it halts. The number of copies written to the tape in the second step is upper-bounded by the busy-beaver number $S(l-\operatorname{len}(p_n))$, that is, the maximum number of steps possibly taken by the second part of the program. (It is unfortunately not bounded by $\Sigma(l-\operatorname{len}(p_n))$, the number of 1s written to the tape, since writing the numéraire may require writing zeroes.)
More detail on $p_m$: You can take the busy beaver program, and then add some business logic where the numéraire output is interleaved between the busy beaver logic, plus some subroutines that copy and move more than one step:
busy beaver step
walk to offset
copy
walk back
busy beaver step
This program is a constant number of bits larger than the busy beaver program, namely the code for the "business logic".
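A toy simulation of that structure (a sketch: `NUMERAIRE`, `run_mugger` and the plain step-budget loop are stand-ins I made up, the loop replacing the uncomputable busy-beaver driver):

```python
# Toy sketch of p_m's structure: interleave a "copy the numéraire"
# subroutine between steps of a long-running driver program.
NUMERAIRE = [1, 0, 1]  # hypothetical pre-written tape pattern (contains a zero!)

def run_mugger(driver_steps: int) -> list[int]:
    tape = list(NUMERAIRE)       # first: run p_n, writing one instance
    for _ in range(driver_steps):
        # busy beaver step (stand-in), then: walk to offset, copy, walk back
        tape.extend(NUMERAIRE)
    return tape

tape = run_mugger(driver_steps=10)
print(len(tape) // len(NUMERAIRE), "copies on the tape")
# The copy count grows linearly in the driver's step count, so a
# busy-beaver-like driver yields busy-beaver-like numbers of copies.
```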
$p_m$ will, I believe, behave busy-beaver-like in that it writes an enormous number of numéraire-copies to the tape before halting. It won't quite write $S(l-\operatorname{len}(p_n))$ copies of the numéraire on the tape, but the number of copies will grow (in $l$) much faster than any standard sequence one may care to name, definitely faster than any polynomial, faster than factorials, surely faster than the Ackermann function… and, most notably, much much faster than an exponential.
Definition: I'll give the growth rate of $p_m$'s copy count the name $CB(l)$, the "copy-beaver number", which is the maximal number of copies a program of length $l$ can make of a pre-written tape-state.
The copy-beaver number is definitely smaller than the busy-beaver number, but my best guess is that it still grows much faster than almost all functions we can characterize85%, that it has "busy-beaver-like" growth behavior.
This leads to a simple consequence: in $\mathcal{P}$, the expected value of the mugging program for the muggee is unbounded in program length.
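Spelled out (a sketch, eliding constants, and using that $U$ is linear in the numéraire):

```latex
\mathbb{E}[U \mid p_m] \;\propto\; 2^{-l} \cdot CB(l)
\;\xrightarrow{\; l \to \infty \;}\; \infty,
\qquad \text{since } CB(l) \text{ grows much faster than } 2^{l}
```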
I don't know if this is the only way of formalizing Pascal's mugging.
Summary: We can circumvent Pascal's mugging by having utilities grow extremely slowly in whatever goods we are presented with, almost as slowly as the inverse of the busy beaver function.
Note that this is not solved by having logarithmically diminishing returns in some resource: that would only work if the copy-beaver function had a slower growth rate than a double exponential, which, nah.
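A worked version of that claim (a sketch, constants elided):

```latex
2^{-l} \cdot \log_2 CB(l) \text{ bounded}
\;\iff\; \log_2 CB(l) \in O\!\left(2^{l}\right)
\;\iff\; CB(l) \in O\!\left(2^{2^{l}}\right)
```

So logarithmic returns only block the mugging if the copy-beaver is at most double-exponential.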
Note that in this case the good isn't a numéraire anymore, since utility functions are linear in numéraires.
Definition: I'll call the goods a utility function is monotonic but not linear in "pseudo-numéraires", e.g. $U(k)=c \cdot \log(k)$ for $k$ instances of the good ($c$ is just some constant).
But we now have an interesting foothold! Logarithmic utility grows too quickly, but we can identify the maximal growth rate of utility in our pseudo-numéraire such that we don't get this kind of divergence of utilities.
Specifically, we need a mugging-immune utility function $U$ so that $2^{-l} \cdot U(CB(l))$ doesn't diverge, which (under the assumption that $CB$ is invertible) leads to:

$$U(k) \in o\!\left(2^{CB^{-1}(k)}\right)$$

(Abusing notation, $U(k)$ is the utility of $k$ copies of the pseudo-numéraire.)
Definition: We can now identify a set of "mugging-immune utility functions" with a single pseudo-numéraire:

$$\mathcal{M}=\left\{U \,\middle|\, U(k) \in o\!\left(2^{CB^{-1}(k)}\right)\right\}$$
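For intuition, here's a hypothetical member-shaped utility function in code. The iterated logarithm `log_star` is only a computable stand-in: the true bound uses $CB^{-1}$, which grows astronomically slower still, so a genuinely mugging-immune $U$ crawls even more than this one.

```python
import math

def log_star(x) -> int:
    """Iterated logarithm: how often must we take log2 to drop below 1?
    A computable stand-in for the far, far slower CB^{-1}."""
    count = 0
    while x >= 1.0:
        x = math.log2(x) if x > 1.0 else 0.0
        count += 1
    return count

def utility(k: int) -> float:
    # Shape of the bound 2^(CB^{-1}(k)), with log_star standing in for CB^{-1}.
    return 2.0 ** log_star(k)

for k in (10, 10**3, 10**9, 10**100):
    print(f"utility({k:.0e}) = {utility(k)}")
```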
This result isn't exact—there may be programs that are mugger-like but grow somewhat faster (e.g. by not first writing the pseudo-numéraire and then copying, but doing something more complex and interleaved.) But the result is "spiritually correct".
One issue that may come up is that a muggee has multiple pseudo-numéraires. I'm pretty sure that a small number of pseudo-numéraires isn't a problem and remains difficult to mug, assuming that utility in each of them grows slower than the bound outlined above.
But if a muggee has a truly astronomical number of pseudo-numéraires that are easy to generate programmatically, e.g. all prefix codes, we do get issues. Similar to a "death by a thousand cuts", the mugger basically tells us "Ah, but what about a world in which you have this one thing you like the first instance of, and this other thing, and yet this other thing…", for a lot of different pseudo-numéraires.
Specifically, if the number of pseudo-numéraires grows faster than $2^{l}$, then a mugger of Kolmogorov complexity $l$ can still succeed by writing one instance of each pseudo-numéraire on the tape, receiving the high utility from the first marginal instance of each pseudo-numéraire. The number of pseudo-numéraires needs to increase this rapidly because there are such harsh diminishing returns for each one.
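In sketch form (my notation: $u_1$ for the utility of the first instance of each pseudo-numéraire, $N(l)$ for the number of pseudo-numéraires a mugger of complexity $l$ can enumerate):

```latex
\mathbb{E}[U \mid p_m] \;\propto\; 2^{-l} \cdot N(l) \cdot u_1,
\qquad \text{which diverges if } N(l) \in \omega\!\left(2^{l}\right)
```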
I think that this will not happen in practice for most utility functions, but it seemed worth noting.
These bounds are pretty harsh. The inverse copy-beaver grows very slowly99%; you're not quite upper-bounded in utility, but you might as well be.
Is there hope?
Well, maybe. I elided a lot of constant factors, asymptotic growth rates &c in the sketch above. My hope is that one can "choose" the level at which one becomes mugging-immune by selecting the right constants and tweaking the growth rate. The resulting utility function should then be ~linear for the resources attainable under our current guesses for the value achievable in the reachable universe, and grow more slowly after that. After all, the inverse copy-beaver grows so slowly that it can look like a bounded utility function if the constants are selected right.
But I don't have a great intuition for when that kind of constant-picking is possible and when it isn't; we may be stuck.
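To make the constant-picking hope concrete, here's a toy sketch (hypothetical names and thresholds throughout; `log_star` is again only a fast stand-in for the far slower inverse copy-beaver):

```python
import math

REACHABLE = 10 ** 60  # hypothetical stand-in for value attainable in the reachable universe

def log_star(x) -> int:
    """Iterated logarithm, standing in for the far slower CB^{-1}."""
    count = 0
    while x >= 1.0:
        x = math.log2(x) if x > 1.0 else 0.0
        count += 1
    return count

def utility(k: int) -> float:
    """~Linear up to REACHABLE, then growing like the slow mugging-immune bound."""
    if k <= REACHABLE:
        return float(k)
    return float(REACHABLE) + 2.0 ** log_star(k - REACHABLE)

print(utility(10**30))    # linear regime: 1e+30
print(utility(10**1000))  # ~1e+60: looks bounded from here on out
```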
One way of defeating Pascal's mugging is to switch to a different prior, specifically one in which the copy-beaver construction receives a very low prior probability—as low as the fake utility it "provides" is high.
One possible solution is to use the speed prior instead of the simplicity prior: For the set $\mathcal{P}$ of computable programs with binary tape state, the prior probability assigned by the speed prior for $p \in \mathcal{P}$ is proportional to $2^{-(\operatorname{len}(p)+\log_2 s(p))}$, where $s(p)$ is the number of steps $p$ takes before halting2.
I think that the speed prior simply solves mugging in this formalization.
My sketchy reason for believing this looks like this:
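(Eliding constants and normalization, and assuming the copy phase takes at least one step per copy it writes:)

```latex
% p_m of length l runs for at least CB(l) steps:
s(p_m) \;\ge\; CB(l)
% so under the speed prior its expected value is at most
\mathbb{E}[U \mid p_m] \;\propto\; 2^{-l} \cdot \frac{1}{s(p_m)} \cdot CB(l)
\;\le\; 2^{-l} \;\xrightarrow{\; l \to \infty \;}\; 0
```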
However: It may be that normalization upweights $p_m$'s prior probability under the speed prior by a "superexponential amount", that is, the $2^{-l}$ bound vanishes because normalization moves the program up by a significant amount in terms of prior probability. My guess is that this doesn't happen (and GPT-5 assures me it doesn't), but I think proving it would require thinking about the structure of computable programs, which I'm not very good at.
Intuitively, the prior cares equally about how long the program is and how long it takes to execute. This makes total sense to me, a human mind composed of neurons trained with an algorithm selected for by natural and sexual selection: I care about my theories maximally compressing the world, sure, but I also care about being able to finish thinking them.
A cool fact about this is that it's almost a pure a priori argument: You need to know nothing about the real world, and have to make remarkably few mathematical assumptions3, to arrive at this conclusion: Mixing a speed prior into your complexity prior saves you from this type of copying Pascal's mugging.
All logic is a prior.
—M Ls, Comment on “Updatelessness doesn't solve most problems”, 2024
I imagine an extremely advanced agent coming into existence, doing some mathematics, and then modifying its prior based on pure mathematical reasoning.
Thanks to Claude 4 & 4.5 Sonnet, Claude 4 & 4.1 Opus and GPT-5 for discussions on this topic.
I have a very impractical idea: Can we have a pure speed prior? One that only cares about how fast programs finish running?
The intuitive answer may seem "no" at first, since unlike with program length, for every number of steps a program needs to halt there are countably infinitely many programs that take that many steps to halt, forcing us to say that any particular program has a prior Lebesgue measure of zero.
I think this is technically called a universal semimeasure. I don't think this distinction kills the below arguments/proof-sketches85%, but obviously can't guarantee it. ↩
Indeed, since we normalize our prior so that it sums to $1$, theoretically we can take any function $f$ of our programs and create a new prior by normalizing $2^{-f(p)}$ or $2^{-(\operatorname{len}(p)+f(p))}$. $f$ can be sophistication, logical depth, number of bitflips performed during program execution… any crazy thing you can come up with, as long as $f$ returns natural numbers. ↩
Namely: diverging utilities are bad, and you're a Solomonoff inductor whose prior has been defined in a particular way. ↩