*author: niplav, created: 2023-12-21, modified: 2024-04-18, language: english, status: finished, importance: 3, confidence: certain*

Subscripts in text can be used to attach explicit probabilities to claims_{99%}.

Gwern has wondered about a use-case for subscripts in hypertext. While they have settled on a specific use-case, namely years for citations, I propose a different one: reporting explicit probabilities.

Explicitly giving probabilities in day-to-day English text is usually quite clunky: "I assign 35% to North Korea testing an intercontinental ballistic missile before the end of this year" reads far less smoothly than "I don't think North Korea will test an intercontinental ballistic missile this year".

And since subscripts are a solution in need of a problem, one can wonder how well those two fit together: Quite well, I claim.

In short, I propose to append probabilities in subscript after a statement
using standard HTML subscript notation (or `$\LaTeX$` as a fallback if
it's available), with the probability possibly also being a link to a
relevant forecasting platform with the same question:

I think Donald Trump is going to be incarcerated before 2030_{65%}.

This is *almost* as readable as the sentence without the probability.

There are some complications with negations in sentences or multiple statements. For the most part, I'll simply avoid such cases ("Doctor, it hurts when I do this!" "Don't do that, then."), but if I had to, I'd solve the first problem by declaring that the probability applies to the literal meaning of the previous sentence, including all negations; the problem with multiple statements is solved by delimiters.

As an example of the different kinds of negation: "The train won't
come more than 5 minutes late_{90%}" would (arguendo) mean the
same thing as "I don't think the train will come more than 5 minutes
late_{90%}", which means the same as "The train will take more than 5
minutes to arrive_{10%}", which is equivalent to "I assign 90% probability
to the train arriving within the next 5 minutes".

With multiple statements, my favorite way of delimiting is currently
half brackets: "I think ⸤it'll rain tomorrow⸥_{55%}, but
⸤Tuesday is going to be sunny⸥_{80%}, but I don't think
⸤your uncle is going to be happy about that⸥_{15%}."
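
A side benefit of explicit delimiters is that claims and their probabilities become easy to extract mechanically. A minimal sketch, assuming the subscripts are written with HTML `<sub>` tags and whole-number percentages; the name `extract_claims` is made up for illustration:

```python
import re

# Matches ⸤claim⸥<sub>NN%</sub>; the claim text is captured non-greedily.
CLAIM = re.compile(r"⸤(.+?)⸥<sub>(\d{1,3})%</sub>")

def extract_claims(text):
    """Return (claim, probability) pairs, with probabilities in [0, 1]."""
    return [(claim, int(pct) / 100) for claim, pct in CLAIM.findall(text)]

sentence = (
    "I think ⸤it'll rain tomorrow⸥<sub>55%</sub>, but "
    "⸤Tuesday is going to be sunny⸥<sub>80%</sub>, but I don't think "
    "⸤your uncle is going to be happy about that⸥<sub>15%</sub>."
)
for claim, p in extract_claims(sentence):
    print(f"{p:.2f}  {claim}")
```

Without the half brackets, a program (like a reader) would have to guess where each subclaim starts.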

The probabilities in this context aren't quite evidentials, but neither are they veridicals nor miratives; I propose the word "credal" for this category.

The exact place of insertion is subtle: In sentences with a single central statement, there are multiple locations one could place the probability.

- After the verb related to belief: "I think_{55%} it'll rain tomorrow."
    - Advantage: Close to the word relating to the belief (which could reflect the strength of belief in itself, using "guess"/"wager"/"think"/"believe").
    - Disadvantages:
        - Conflicts with assigning probabilities to multiple statements.
        - Puts visual clutter before the statement in question.
- At the end of the statement: "I think it'll rain tomorrow_{55%}."
    - Advantages:
        - Allows assigning probabilities to simple statements ("It'll rain tomorrow_{55%}") and to multiple statements (see below).
        - Allows distinguishing the beliefs of different people. "I think_{55%} it'll rain tomorrow, but Cú Chulainn disagrees_{22%}."
    - Disadvantage: If the probability is intended to contextualise the statement, this context is weaker if it is introduced *after* the statement in question.
- At the subject of the sentence: "I_{55%} think it'll rain tomorrow."
    - Advantage: This can be used to distinguish the beliefs of different people. "I_{55%} think it'll rain tomorrow, but Cú Chulainn_{22%} is skeptical about it."
    - Disadvantage: Putting the probability before the statement the probability is about feels quite unnatural.

This becomes trickier in sentences with multiple statements.

- Probabilities after each subclaim: "I think it'll rain tomorrow_{55%}, but Tuesday is going to be sunny_{80%}, but I don't think your uncle is going to be happy about that_{15%}."
- Adding in delimiters to denote a specific subclaim the probability is about. I wonder whether there are better Unicode characters for this; corner brackets might be a good candidate.
    - Lower half brackets (or Quine corners, which look almost the same): "I think ⸤it'll rain tomorrow⸥_{55%}, but ⸤Tuesday is going to be sunny⸥_{80%}, but I don't think ⸤your uncle is going to be happy about that⸥_{15%}."
    - Upper half brackets to the left, lower half brackets to the right: "I think ⸢it'll rain tomorrow⸥_{55%}, but ⸢Tuesday is going to be sunny⸥_{80%}, but I don't think ⸢your uncle is going to be happy about that⸥_{15%}."
    - Subscripted parentheses: "I think _{(}it'll rain tomorrow_{)}_{55%}, but _{(}Tuesday is going to be sunny_{)}_{80%}, but I don't think _{(}your uncle is going to be happy about that_{)}_{15%}."
    - Subscripted half guillemets: "I think _{‹}it'll rain tomorrow_{›}_{55%}, but _{‹}Tuesday is going to be sunny_{›}_{80%}, but I don't think _{‹}your uncle is going to be happy about that_{›}_{15%}."
    - And subscripted full guillemets: "I think _{«}it'll rain tomorrow_{»}_{55%}, but _{«}Tuesday is going to be sunny_{»}_{80%}, but I don't think _{«}your uncle is going to be happy about that_{»}_{15%}."
- I basically rule out lists of probabilities after the verb relating to each subclaim, as it's very mentally taxing to relate each probability to each claim:
    - "I think_{55%, 80%, 15%} ⸤it'll rain tomorrow⸥, but ⸤Tuesday is going to be sunny⸥, but I don't think ⸤your uncle is going to be happy about that⸥."

A variant of the notation could use decimal notation instead
of percentages, and leave out trailing zeroes. "I think it'll
rain tomorrow`$_{50\%}$`" would then become the more compact "I
think it'll rain tomorrow`$_{.5}$`". This has the advantage of
being compatible with plain text through the combining dot below
diacritic, which would
yield "I think it'll rain tomorroẉ₅". However, the meaning of the
combining dot can be ambiguous to uninformed readers.
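
The plain-text variant can be produced mechanically. A minimal sketch, assuming probabilities strictly between 0 and 1 and rounding to two decimal places (both arbitrary choices of this sketch); the name `annotate_plain` is made up for illustration:

```python
import unicodedata

SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
COMBINING_DOT_BELOW = "\u0323"

def annotate_plain(statement, probability):
    """Append a probability as a dot below plus Unicode subscript digits."""
    # Assumption: two decimal places; drop the leading "0." and trailing zeroes.
    digits = f"{probability:.2f}"[2:].rstrip("0")
    if not 0 < probability < 1 or not digits:
        raise ValueError("this sketch only handles probabilities from 0.01 to 0.99")
    marked = statement + COMBINING_DOT_BELOW + digits.translate(SUBSCRIPT_DIGITS)
    # NFC merges the last letter and the dot into one code point where one exists.
    return unicodedata.normalize("NFC", marked)

print(annotate_plain("I think it'll rain tomorrow", 0.5))
```

The dot sits under the last letter of the statement and plays the role of the decimal point, so no separate subscript '.' is needed.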

On LessWrong, one can also use reacts signifying probabilities on one's own text. While it's restricted to LessWrong, it also allows other people to easily assign different probabilities to your statements.

Since the people writing the text
reporting probabilities are probably logically
non-omniscient bounded agents, it might also
be useful to report the time or effort one has spent on refining
the reported probability: "I reckon humanity will survive the 21st
century_{55%:20h}", indicating that the speaker has reflected
on this question for 20 hours to arrive at their current probability
(something akin to reporting an "epistemic effort" for a piece of
information). I fear that this notation is getting into cumbersome
territory and won't be using it.

There are three available options: Either one's writing platform supports
HTML, in which case one can use the `<sub>18%</sub>` tags (giving
_{18%}), or it supports `$\LaTeX$`, which creates a slightly
fancier-looking but also more fragile notation using `_{18\%}` (resulting
in `$_{18\%}$`), or one's platform directly supports subscripting, such
as pandoc with `~18%~`, but not
Reddit Markdown (which *does* support superscript). More info about other
platforms here.
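
The three options can be put side by side in a small helper. This is only a sketch; the function name `subscript_markup` and the flavor keys are made up for illustration, and the templates are the ones given above:

```python
def subscript_markup(percent, flavor):
    """Render a whole-number percentage as a subscript in the given markup."""
    templates = {
        "html": "<sub>{p}%</sub>",
        "latex": "$_{{{p}\\%}}$",  # doubled braces are literal braces for str.format
        "pandoc": "~{p}%~",
    }
    return templates[flavor].format(p=percent)

for flavor in ("html", "latex", "pandoc"):
    print(flavor, subscript_markup(18, flavor))
```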

Ideally one would simply use Unicode subscripts, which are available for all digits, but tragically not for the percentage sign '%' or a simple dot '.'. Perhaps a project for the future: After all, they did include a subscript '+'₊, a subscript '-'₋, equality sign '='₌ and parentheses '()'₍₎, but several subscript letters (b, c, d, f, g, q, w, y and z) are still missing…
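
To see where a pure-Unicode approach breaks down, here is a minimal sketch that converts only the characters listed above and reports everything else; the name `to_unicode_subscript` is made up for illustration:

```python
# Only the digits and the symbols + - = ( ) are mapped in this sketch.
SUBSCRIPTS = str.maketrans("0123456789+-=()", "₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎")

def to_unicode_subscript(text):
    """Convert text to Unicode subscripts, failing on unmapped characters."""
    missing = sorted({ch for ch in text if ord(ch) not in SUBSCRIPTS})
    if missing:
        raise ValueError(f"no subscript mapping for: {' '.join(missing)}")
    return text.translate(SUBSCRIPTS)

print(to_unicode_subscript("18"))
try:
    to_unicode_subscript("18%")
except ValueError as err:
    print(err)
```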

I've used this notation sparingly but increasingly; a good example of a first exploration is here, and it is interspersed in the text here.

Fischer 2023 uses a different notation:

- Given hedonism and conditional on sentience, we think (credence: 0.7) that none of the vertebrate nonhuman animals of interest have a welfare range that’s more than double the size of any of the others. While carp and salmon have lower scores than pigs and chickens, we suspect that’s largely due to a lack of research.
- Given hedonism and conditional on sentience, we think (credence: 0.65) that the welfare ranges of humans and the vertebrate animals of interest are within an order of magnitude of one another.
- Given hedonism and conditional on sentience, we think (credence 0.6) that all the invertebrates of interest have welfare ranges within two orders of magnitude of the vertebrate nonhuman animals of interest. Invertebrates are so diverse and we know so little about them; hence, our caution.

The notation proposed here would change the text:

- Given hedonism and conditional on sentience, we think that none of the vertebrate nonhuman animals of interest have a welfare range that’s more than double the size of any of the others_{70%}. While carp and salmon have lower scores than pigs and chickens, we suspect that’s largely due to a lack of research.
- Given hedonism and conditional on sentience, we think that the welfare ranges of humans and the vertebrate animals of interest are within an order of magnitude of one another_{65%}.
- Given hedonism and conditional on sentience, we think that all the invertebrates of interest have welfare ranges within two orders of magnitude of the vertebrate nonhuman animals of interest_{60%}. Invertebrates are so diverse and we know so little about them; hence, our caution.

"Likelihood ratios are good! Likelihood ratios are the only good thing!"

"I agree that likelihood ratios are good! In fact, I think we have a moral responsibility to look for clever strategies to make the likelihood ratios bigger! But at the same time, you know, priors."

"Priors?! How dare you?! Priors are bad!"

*—Mark Taylor Saotome-Westlake, “Interlude X”, 2017*

For sharing a likelihood
ratio, we
need to talk about both the hypothesis `$H$` *and* the evidence
`$E$`. If I then want to say that `$E$` updates `$H$` by `$k$` shannon, how could I
write that?

- No need to invent special notation, saying "`$E$` provides `$k$` bits for/against `$H$`" is enough.
- `$E⥌_{k}H$`, specifically `$E⥜_{k}H$` if `$E$` is evidence for `$H$`, and `$E⥝_{k}H$` if `$E$` is evidence *against* `$H$`.
    - The variants `$E⥣_{k}H$` and `$E⥥_{k}H$` in cases where `$E$` is *strong* evidence.
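
Whatever the notation, the number `$k$` itself is easy to compute. A minimal sketch, assuming the usual reading of `$k$` as the base-2 logarithm of the likelihood ratio `$P(E|H)/P(E|\neg H)$`; the function names are made up for illustration:

```python
import math

def evidence_bits(p_e_given_h, p_e_given_not_h):
    """Bits (shannon) by which E favours H; negative if E counts against H."""
    return math.log2(p_e_given_h / p_e_given_not_h)

def update(prior, bits):
    """Posterior probability of H after evidence worth `bits` shannon."""
    posterior_odds = prior / (1 - prior) * 2 ** bits
    return posterior_odds / (1 + posterior_odds)

# E is four times as likely under H as under not-H, so it is worth 2 bits.
k = evidence_bits(0.8, 0.2)
print(k, update(0.5, k))
```

The sign of `$k$` carries the for/against distinction that the harpoon variants above encode typographically.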