author: niplav, created: 2025-08-27, modified: 2025-10-27, language: english, status: in progress, importance: 4, confidence: possible
I ~formalize Pascal's mugging for utility-maximizing Solomonoff inductors with a simplicity prior, and then characterize a set of utility functions that are immune to mugging. The utility functions in this set have strongly diminishing marginal returns: utilities must grow slower than $2^{CB^{-1}(k)}$, that is, slower than the exponential of the inverse of a busy-beaver-like function. This "breaks the boundedness barrier" for mugging-immune utility functions.
Switching to a Levin-style speed prior solves this kind of Pascal's mugging.
I first attempt to formalize Pascal's mugging, then characterize a set of mugging-immune utility functions, and finally speculate on how one might escape the very harsh bound I have discovered.
In previous discussions, a commonly agreed-upon solution to Pascal's mugging was to have a bounded utility function, which is unsatisfying from a utilitarian perspective—a utilitarian ideally wants to be able to say that twice as much of a good thing is twice as good, or at least that strictly more of a good thing is strictly better.
Summary: The prior probability of worlds decreases exponentially with increasing program length, but the utility of some program output can grow very quickly: Almost as fast as the busy beaver numbers.
Consider a muggee that implements Solomonoff induction as its epistemic process. Specifically, for the set $\mathcal{P}$ of computable programs with binary tape state, the prior probability assigned by the muggee for $p \in \mathcal{P}$ is proportional to $2^{-\operatorname{len}(p)}$—we're using the simplicity prior1.
That is, intuitively, the prior probability decreases exponentially in the length of the program.
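A minimal sketch of that prior over a toy, hypothetical program set (real Solomonoff induction ranges over all computable programs and is uncomputable; `programs` here is just made-up bitstrings):

```python
# Toy simplicity prior: weight each "program" by 2^(-len(p)), then normalize.
programs = ["0", "10", "110", "111000", "1010101010"]

weights = {p: 2.0 ** -len(p) for p in programs}
total = sum(weights.values())
prior = {p: w / total for p, w in weights.items()}

for p, pr in prior.items():
    print(f"len={len(p):2d}  prior={pr:.4f}")
# Each extra bit of program length halves the unnormalized weight.
```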
Let $U$ be the muggee's quasilinear utility function, that is, the muggee has a numéraire which their utility function is linear in. This numéraire can be anything from money, happy observer-moments, paperclips, you name it. We can call the shortest program that outputs one instance of our numéraire $p_n$.
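For concreteness, quasilinearity here means (my notation, the standard quasilinear form; $k$ counts instances of the numéraire and "rest" is everything else the muggee might care about):

```latex
U(k, \text{rest}) \;=\; c \cdot k \;+\; v(\text{rest}), \qquad c > 0
```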
We can now handwavingly construct a mugging program $p_m$ from $p_n$.
Given a budget of $l$ bits (with $l$ much larger than $\operatorname{len}(p_n)$): First it runs $p_n$ and thus writes an instance of our numéraire to the tape, and then copies that instance as many times as it can before it halts. The number of copies written to the tape in the second step is upper-bounded by the busy-beaver number $S(l-\operatorname{len}(p_n))$, that is, the maximum number of steps possibly taken by the second part of the program. (It is unfortunately not bounded by $\Sigma(l-\operatorname{len}(p_n))$, the number of 1s written to the tape, since writing the numéraire may require writing zeroes.)
More detail on $p_m$: You can take the busy beaver program, and then add some business logic where the numéraire output is interleaved between the busy beaver logic, plus some subroutines that copy and move more than one step:
busy beaver step
walk to offset
copy
walk back
busy beaver step
This program is a constant number of bits larger than the busy beaver program, namely the code for the "business logic".
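A toy simulation of that structure (a sketch: `NUMERAIRE`, `run_mugger` and the plain step-budget loop are stand-ins I made up, the loop replacing the uncomputable busy-beaver driver):

```python
# Toy sketch of p_m's structure: interleave a "copy the numéraire"
# subroutine between steps of a long-running driver program.
NUMERAIRE = [1, 0, 1]  # hypothetical pre-written tape pattern (contains a zero!)

def run_mugger(driver_steps: int) -> list[int]:
    tape = list(NUMERAIRE)       # first: run p_n, writing one instance
    for _ in range(driver_steps):
        # busy beaver step (stand-in), then: walk to offset, copy, walk back
        tape.extend(NUMERAIRE)
    return tape

tape = run_mugger(driver_steps=10)
print(len(tape) // len(NUMERAIRE), "copies on the tape")
# The copy count grows linearly in the driver's step count, so a
# busy-beaver-like driver yields busy-beaver-like numbers of copies.
```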
$p_m$ will, I believe, behave busy-beaver-like in that it writes an enormous number of numéraire-copies to the tape before halting. It won't quite write $S(l-\operatorname{len}(p_n))$ copies of the numéraire on the tape, but the number of copies will grow (in $l$) much faster than any standard sequence one may care to name, definitely faster than any polynomial, faster than factorials, surely faster than the Ackermann function… and, most notably, much much faster than an exponential.
Definition: I'll give the growth rate of $p_m$'s copy count the name $CB(l)$, the "copy-beaver number", which is the maximal number of copies a program of length $l$ can make of a pre-written tape-state.
The copy-beaver number is definitely smaller than the busy-beaver number, but my best guess is that it still grows much faster than almost all functions we can characterize85%, that it has "busy-beaver-like" growth behavior.
This leads to a simple consequence: in $\mathcal{P}$, the expected value of the mugging program for the muggee is unbounded in program length.
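Spelled out (a sketch, eliding constants, and using that $U$ is linear in the numéraire):

```latex
\mathbb{E}[U \mid p_m] \;\propto\; 2^{-l} \cdot CB(l)
\;\xrightarrow{\; l \to \infty \;}\; \infty,
\qquad \text{since } CB(l) \text{ grows much faster than } 2^{l}
```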
I don't know if this is the only way of formalizing Pascal's mugging.
Summary: We can circumvent Pascal's mugging by having utilities grow extremely slowly in whatever goods we are presented with, almost as slowly as the inverse of the busy beaver function.
Note that this is not solved by having logarithmically diminishing returns in some resource: that would only work if the copy-beaver function had a slower growth rate than a double exponential, which, nah.
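A worked version of that claim (a sketch, constants elided):

```latex
2^{-l} \cdot \log_2 CB(l) \text{ bounded}
\;\iff\; \log_2 CB(l) \in O\!\left(2^{l}\right)
\;\iff\; CB(l) \in O\!\left(2^{2^{l}}\right)
```

So logarithmic returns only block the mugging if the copy-beaver is at most double-exponential.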
Note that in this case the good isn't a numéraire anymore, since utility functions are linear in numéraires.
Definition: I'll call the goods a utility function is monotonic but not linear in "pseudo-numéraires", e.g. $U(k)=c \cdot \log(k)$ for $k$ instances of the good ($c$ is just some constant).
But we now have an interesting foothold! Logarithmic utility grows too quickly, but we can identify the maximal growth rate of utility in our pseudo-numéraire such that we don't get this kind of divergence of utilities.
Specifically, we need a mugging-immune utility function $U$ so that $2^{-l} \cdot U(CB(l))$ doesn't diverge, which (under the assumption that $CB$ is invertible) leads to:

$$U(k) \in o\!\left(2^{CB^{-1}(k)}\right)$$

(Abusing notation, $U(k)$ is the utility of $k$ copies of the pseudo-numéraire.)
Definition: We can now identify a set of "mugging-immune utility functions" with a single pseudo-numéraire:

$$\mathcal{M}=\left\{U \,\middle|\, U(k) \in o\!\left(2^{CB^{-1}(k)}\right)\right\}$$
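For intuition, here's a hypothetical member-shaped utility function in code. The iterated logarithm `log_star` is only a computable stand-in: the true bound uses $CB^{-1}$, which grows astronomically slower still, so a genuinely mugging-immune $U$ crawls even more than this one.

```python
import math

def log_star(x) -> int:
    """Iterated logarithm: how often must we take log2 to drop below 1?
    A computable stand-in for the far, far slower CB^{-1}."""
    count = 0
    while x >= 1.0:
        x = math.log2(x) if x > 1.0 else 0.0
        count += 1
    return count

def utility(k: int) -> float:
    # Shape of the bound 2^(CB^{-1}(k)), with log_star standing in for CB^{-1}.
    return 2.0 ** log_star(k)

for k in (10, 10**3, 10**9, 10**100):
    print(f"utility({k:.0e}) = {utility(k)}")
```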
This result isn't exact—there may be programs that are mugger-like but grow somewhat faster (e.g. by not first writing the pseudo-numéraire and then copying, but doing something more complex and interleaved.) But the result is "spiritually correct".
One issue that may come up is that a muggee has multiple pseudo-numéraires. I'm pretty sure that a small number of pseudo-numéraires isn't a problem and remains difficult to mug, assuming that utility in each of them grows slower than the bound outlined above.
But if a muggee has a truly astronomical number of pseudo-numéraires that are easy to generate programmatically, e.g. all prefix codes, we do get issues. Similar to a "death by a thousand cuts", the mugger basically tells us "Ah, but what about a world in which you have this one thing you like the first instance of, and this other thing, and yet this other thing…", for a lot of different pseudo-numéraires.
Specifically, if the number of pseudo-numéraires grows faster than $2^{l}$, then a mugger of Kolmogorov complexity $l$ can still succeed by writing one instance of each pseudo-numéraire on the tape, receiving the high utility from the first marginal instance of each pseudo-numéraire. The number of pseudo-numéraires needs to increase this rapidly because there are such harsh diminishing returns for each one.
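In sketch form (my notation: $u_1$ for the utility of the first instance of each pseudo-numéraire, $N(l)$ for the number of pseudo-numéraires a mugger of complexity $l$ can enumerate):

```latex
\mathbb{E}[U \mid p_m] \;\propto\; 2^{-l} \cdot N(l) \cdot u_1,
\qquad \text{which diverges if } N(l) \in \omega\!\left(2^{l}\right)
```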
I think that this will not happen in practice for most utility functions, but it seemed worth noting.
These bounds are pretty harsh. The inverse copy-beaver grows very slowly99%; you're not quite upper-bounded in utility, but you might as well be.
Is there hope?
Well, maybe. I elided a lot of constant factors, asymptotic growth rates &c in the sketch above. My hope is that one can "choose" the level at which one becomes mugging-immune by selecting the right constants and tweaking the growth rate. The resulting utility function should then be ~linear for the resources attainable under our current guesses for the value achievable in the reachable universe, and grow more slowly after that. After all, the inverse copy-beaver grows so slowly that it can look like a bounded utility function if the constants are selected right.
But I don't have a great intuition for when that kind of constant-picking is possible and when it isn't; we may be stuck.
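To make the constant-picking hope concrete, here's a toy sketch (hypothetical names and thresholds throughout; `log_star` is again only a fast stand-in for the far slower inverse copy-beaver):

```python
import math

REACHABLE = 10 ** 60  # hypothetical stand-in for value attainable in the reachable universe

def log_star(x) -> int:
    """Iterated logarithm, standing in for the far slower CB^{-1}."""
    count = 0
    while x >= 1.0:
        x = math.log2(x) if x > 1.0 else 0.0
        count += 1
    return count

def utility(k: int) -> float:
    """~Linear up to REACHABLE, then growing like the slow mugging-immune bound."""
    if k <= REACHABLE:
        return float(k)
    return float(REACHABLE) + 2.0 ** log_star(k - REACHABLE)

print(utility(10**30))    # linear regime: 1e+30
print(utility(10**1000))  # ~1e+60: looks bounded from here on out
```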
One way of defeating Pascal's mugging is to switch to a different prior, specifically one in which the copy-beaver construction receives a very low prior probability—as low as the fake utility it "provides" is high.
One possible solution is to use the speed prior instead of the simplicity prior: For the set $\mathcal{P}$ of computable programs with binary tape state, the prior probability assigned by the speed prior for $p \in \mathcal{P}$ is proportional to $2^{-(\operatorname{len}(p)+\log_2 s(p))}$, where $s(p)$ is the number of steps $p$ takes before halting2.
I think that the speed prior simply solves mugging in this formalization.
My sketchy reason for believing this looks like this:
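(Eliding constants and normalization, and assuming the copy phase takes at least one step per copy it writes:)

```latex
% p_m of length l runs for at least CB(l) steps:
s(p_m) \;\ge\; CB(l)
% so under the speed prior its expected value is at most
\mathbb{E}[U \mid p_m] \;\propto\; 2^{-l} \cdot \frac{1}{s(p_m)} \cdot CB(l)
\;\le\; 2^{-l} \;\xrightarrow{\; l \to \infty \;}\; 0
```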
However: It may be that normalization upweights $p_m$'s prior probability under the speed prior by a "superexponential amount", that is, the $2^{-l}$ bound vanishes because normalization moves the program up by a significant amount in terms of prior probability. My guess is that this doesn't happen (and GPT-5 assures me it doesn't), but I think proving it would require thinking about the structure of computable programs, which I'm not very good at.
Intuitively, the prior cares equally about how long the program is and how long it takes to execute. This makes total sense to me, a human mind composed of neurons trained with an algorithm selected for by natural and sexual selection: I care about my theories maximally compressing the world, sure, but I also care about being able to finish thinking them.
A cool fact about this is that it's almost a pure a priori argument: You need to know nothing about the real world, and have to make remarkably few mathematical assumptions3, to arrive at this conclusion: Mixing a speed prior into your complexity prior saves you from this type of copying Pascal's mugging.
All logic is a prior.
—M Ls, Comment on “Updatelessness doesn't solve most problems”, 2024
I imagine an extremely advanced agent coming into existence, doing some mathematics, and then modifying its prior based on pure mathematical reasoning.
Thanks to Claude 4 & 4.5 Sonnet, Claude 4 & 4.1 Opus and GPT-5 for discussions on this topic.
I have a very impractical idea: Can we have a pure speed prior? One that only cares about how fast programs finish running?
The intuitive answer may seem "no" at first, since unlike with program length, for every number of steps a program needs to halt there are countably infinitely many programs that take that many steps to halt, forcing us to say that any particular program has a prior Lebesgue measure of zero.
I think this is technically called a universal semimeasure. I don't think this distinction kills the below arguments/proof-sketches85%, but obviously can't guarantee it. ↩
Indeed, since we normalize our prior so that it sums to $1$, theoretically we can take any function $f$ of our programs and create a new prior by normalizing $2^{-f(p)}$ or $2^{-(\operatorname{len}(p)+f(p))}$. $f$ can be sophistication, logical depth, number of bitflips performed during program execution… any crazy thing you can come up with, as long as $f$ returns natural numbers. ↩
Namely: diverging utilities are bad, and you're a Solomonoff inductor whose prior has been defined in a particular way. ↩