Detailed comments and suggested fixes:
- Model and Busy Beaver alignment
- Right now you mix “program length k” with a Busy Beaver variant defined in terms of “2 symbols, m states,” writing things like $$S(2, k - l(C_n))$$. That mismatch can be avoided by staying within a single UTM model with prefix-free programs.
- Suggestion: define a steps Busy Beaver relative to a fixed prefix-free UTM, e.g.
- $$BB_{\text{steps}}(n) := \max\{\, \text{steps}(p) : p \text{ halts and } |p| \le n \,\}$$.
- Then define the copy beaver on the same machine as
- $$CB(n) := \max\{\, \text{copies of a fixed target string } t \text{ output by } p : p \text{ halts and } |p| \le n \,\}$$.
- Because writing one more copy requires at least a constant number of steps (finite alphabet, single head, finite write speed), you get
- $$CB(n) = \Theta(BB_{\text{steps}}(n))$$, with the implied constants depending on the target-string length and the machine (a toy step-counting sketch follows this block).
- This keeps everything uniform and avoids the states/length mismatch.
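- If helpful, a minimal sketch of the finite-write-speed direction ($$CB \le BB_{\text{steps}}/\alpha$$); the names here (`TARGET`, `run_copier`) are made up for illustration, not from your draft:

```python
# Toy model of the finite-write-speed argument: a machine that emits one
# symbol per step finishes at most steps // len(TARGET) copies of TARGET.
TARGET = "01"  # the fixed target string t

def run_copier(step_budget: int) -> int:
    """Write TARGET repeatedly, one symbol per step; return finished copies."""
    symbols_written = 0
    for _ in range(step_budget):   # one loop iteration = one machine step
        symbols_written += 1
    return symbols_written // len(TARGET)

for budget in (10, 100, 1000):
    copies = run_copier(budget)
    assert copies * len(TARGET) <= budget   # copies <= steps/alpha, alpha = |t|
    print(budget, copies)
```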
- Divergence under the simplicity prior
- With the above, $$BB_{\text{steps}}(n)$$ (and thus $$CB(n)$$) dominates every computable function, hence certainly every exponential in $$n$$. So your informal limit “mugger EV behaves like $$2^{-k}\,CB(k)\to\infty$$” is directionally correct.
- If you want a clean statement: for any fixed constant $$c>0$$, $$CB(n) / 2^{cn} \to \infty$$. Consequently a sequence of “copy-beaver” programs of increasing length yields unbounded expected utility under the Solomonoff length prior. (A numeric illustration with a computable stand-in follows this block.)
- Nit: Solomonoff’s prior is a universal semimeasure, not a measure. That doesn’t change your point.
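- Since $$CB$$ is uncomputable, a quick way to see the divergence numerically is to substitute any computable function that outgrows every exponential; the real $$CB$$ grows faster still. The `tower` function below is a hypothetical stand-in:

```python
# tower(k) = 2^2^...^2 (k twos) already outgrows every exponential;
# CB grows faster still, so 2^{-k} * CB(k) diverges a fortiori.
def tower(k: int) -> int:
    v = 1
    for _ in range(k):
        v = 2 ** v
    return v

for k in range(1, 6):
    ratio = tower(k) >> k            # exact integer value of 2^{-k} * tower(k)
    print(k, ratio.bit_length())     # bit counts of the "EV": 1, 1, 2, 13, 65532
```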
- The inverse-bound for mugging-immune utilities under the length prior
- Your derivation
- Require for each length $$k$$: $$2^{-k}\,U_i(CB(k)) \le c$$.
- Hence $$U_i(CB(k)) \le c\,2^k$$.
- Using an inverse of $$CB$$: $$U_i(x) \le c\,2^{CB^{-1}(x)}$$.
- Two tweaks make this precise:
- Use the generalized inverse $$CB^{-1}(x) := \min\{\, n : CB(n) \ge x \,\}$$ (monotone and well-defined even if $$CB$$ isn’t strictly increasing; sketched in code after this list).
- Add a constant-factor slack from machine details; the clean version is
- $$U_i(x) \le O\!\left(2^{CB^{-1}(x)}\right)$$.
- This does “break the boundedness barrier”: these utilities can be unbounded yet mugging-immune against copy-style muggers under the length prior. They are, however, extremely slowly growing, as you note.
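- A minimal sketch of the generalized inverse, using a hypothetical monotone stand-in `f` for the uncomputable $$CB$$ (plateaus included, to show the definition still works):

```python
# Generalized inverse f^{-1}(x) := min{ n : f(n) >= x } for nondecreasing f.
def generalized_inverse(f, x: int) -> int:
    n = 0
    while f(n) < x:
        n += 1
    return n

f = lambda n: 2 ** (n // 2)   # nondecreasing but not strictly increasing

for x in (1, 2, 3, 5, 100):
    n = generalized_inverse(f, x)
    assert f(n) >= x and (n == 0 or f(n - 1) < x)   # minimality holds
    print(x, n)
```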
- “Log diminishing returns isn’t enough”
- Correct. Since $$\log CB(k)$$ outgrows any linear function in $$k$$ (indeed any computable function), $$2^{-k}\log CB(k)$$ still diverges. Your displayed inequality is fine.
- Multiple pseudo-numéraires
- The informal “threshold” using $$CB(\log_2 k)$$ isn’t well-justified as written; it mixes description length and copy budgets without a clean construction.
- If you want a safe statement under the length prior alone: a mugger can spend bits to specify many distinct pseudo-numéraires and produce the first instance of each; if your utility has steep early marginal gains across a very large family of cheaply generatable goods, that can reintroduce problems. A precise threshold would need a model of how many distinct “first units” can be generated by a length‑$$k$$ program (which is again bounded by runtime, hence by $$BB_{\text{steps}}(k)$$, not by a mere function of $$k$$ like $$\log k$$).
- The speed prior part (this is where a crisp general argument helps)
- What you wrote as a “speed prior” $$P(p) \propto 2^{-(l(p)+s(p))}$$ is not the usual Schmidhuber/Levin choice (see below), but it’s a valid time-penalized semimeasure, and it does suffice to kill this mugging.
- General lemma (captures both your exponential penalty and the classical Kt/Levin penalty):
- Consider a prior of the form $$P(p) \propto 2^{-|p|}\,g(s(p))$$, where $$g:\mathbb{N}\to\mathbb{R}_+$$ is nonincreasing.
- Assume any program that produces $$v$$ units of the good must take at least $$s \ge \alpha\,v$$ steps for some machine-dependent constant $$\alpha>0$$ (finite write speed).
- Then the expected utility contributed by a “copy-style” mugger of description length $$k$$ is at most
- $$2^{-k}\,\alpha^{-1}\sup_{s\ge 1} s\,g(s)$$, since the mugger’s prior weight is at most $$2^{-k} g(s)$$ while its payoff is $$v \le s/\alpha$$.
- Conclusion: If $$\sup_s s\,g(s) < \infty$$, all such muggings are uniformly bounded (no divergence, no mugging).
- Instantiations (checked numerically in the sketch after this list):
- Your exponential penalty: $$g(s)=2^{-s}$$ gives $$\sup_s s\,2^{-s} \le 1$$. So EV ≤ constant·$$2^{-k}$$; tends to 0 and cannot dominate.
- Levin/“Kt” style speed prior: $$g(s)=1/s$$ (equivalently weighting by $$2^{-(|p|+\log s)}$$). Then EV ≤ constant·$$2^{-k}$$. This is the standard result: penalizing by runtime 1/s suffices.
- Any polynomial penalty $$g(s)=1/s^c$$ with $$c>1$$ also works; $$c=1$$ is already enough.
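- A quick finite-range check of $$\sup_{s\ge 1} s\,g(s)$$ for these three penalties (an illustration, not a proof):

```python
# A finite sup_{s>=1} s*g(s) is exactly the lemma's condition for bounded
# mugger EV; scan a finite range of s for each penalty g.
penalties = {
    "exponential g(s)=2^-s": lambda s: 2.0 ** (-s),
    "Levin/Kt   g(s)=1/s":   lambda s: 1.0 / s,
    "quadratic  g(s)=1/s^2": lambda s: 1.0 / s ** 2,
}

for name, g in penalties.items():
    sup = max(s * g(s) for s in range(1, 10_000))
    print(f"{name}: sup ~= {sup:.3f}")
# exponential ~0.5, Levin ~1.0, quadratic ~1.0 -- all finite, so no mugging
```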
- Normalization worry
- The “could normalization upweight by a superexponential amount?” worry does not apply. The normalizing constant $$Z=\sum_p 2^{-|p|}g(s(p))$$ is a finite positive constant independent of $$k$$ (because $$2^{-|p|}$$ is summable over a prefix-free set and $$g(s)\le g(1)$$). Multiplying by $$1/Z$$ cannot cancel the per‑program runtime penalty in a $$k$$-dependent way.
- Upshot: a standard speed prior (e.g., Levin’s $$2^{-|p|}/s(p)$$) already blocks Pascal-style muggings that rely on producing vast amounts of the valued good via long runtimes. Your stronger exponential‑in‑steps penalty also works, but is stricter than necessary and nonstandard.
- “Pure speed prior”
- The obstacle is not “Lebesgue measure zero.” The real issue is normalization: if you tried $$P(p)\propto g(s(p))$$ with no length penalty, the sum over all programs diverges because there are exponentially many prefix-free programs for each length and infinitely many lengths. You need a length penalty (e.g., $$2^{-|p|}$$) for Kraft-style summability. So a pure speed-only prior is not normalizable in the usual way.
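- A crude numeric sketch of that divergence, under the illustrative assumption that each length $$n$$ has about $$2^{n-2}$$ prefix-free programs halting in roughly $$n$$ steps (padding bits that are read and skipped):

```python
# Speed-only "prior" P(p) ∝ g(s(p)) with g(s)=1/s and no 2^{-|p|} factor:
# each length n contributes ~2^(n-2) programs * g(n), so the normalizer's
# partial sums blow up. (Counts are illustrative; the point is exponential
# growth in program count per length with no compensating length penalty.)
partial = 0.0
for n in range(2, 41):
    partial += 2 ** (n - 2) * (1.0 / n)
    if n % 10 == 0:
        print(n, f"{partial:.3e}")   # grows without bound
```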
- Scope of your immunity result
- Your bound using $$CB$$ shows immunity against any mugger whose trick is “use a short program and a huge runtime to produce a lot of the pseudo-numéraire.” That’s the canonical PM failure mode under the simplicity prior.
- With a speed prior, the same finite-write-speed argument shows immunity not just to copying but to any scheme where the quantity of value produced scales at most linearly with steps. This includes “first unit across many goods” type muggings as well, since producing $$m$$ distinct first units still needs $$\Omega(m)$$ steps overall.
- Small phrasing/math nits to consider fixing
- Replace “proportional to $$2^{-l(C)}$$” with “a universal prefix semimeasure that assigns $$C$$ weight $$O(2^{-l(C)})$$.”
- Replace your limit display “$$ \lim_{k\to\infty} 2^{-k} CB(k) = \infty$$” with “$$2^{-k} CB(k)\to\infty$$” or “$$CB(k)$$ dominates every exponential in $$k$$.”
- When you write the inverse-bound, say explicitly that you’re using the generalized inverse and include the big‑O: $$U(k) \le O\!\left(2^{CB^{-1}(k)}\right)$$.
- Clarify that writing the numéraire may require zeros, hence use step-based BB (as you noted), not the “1s written” variant.
- Optional strengthening you could add
- State the general “speed prior condition” explicitly:
- If your prior weights programs by $$2^{-|p|} g(s(p))$$ with nonincreasing $$g$$ such that $$\sup_{s} s\,g(s)<\infty$$, then no Pascal-style mugging via runtime amplification is possible for any monotone unbounded utility that values a good deliverable at finite write speed.
- This abstracts away from the exact growth rate of $$CB$$ and makes the “need for speed” point crisp.
Bottom line:
- The copy-beaver construction and the “almost Busy Beaver like” growth are fine once you pin them to a steps‑based BB on a fixed UTM.
- Your inverse‑CB bound for mugging‑immune utilities under the length prior is algebraically right; just use the generalized inverse and big‑O.
- The speed prior idea is correct and can be made both simpler and stronger by using the standard Levin/Schmidhuber form; normalization cannot rescue the mugger.
- The “pure speed prior” section should be revised: the issue is non-normalizability without a length penalty, not measure-zero per program.