author: niplav, created: 2023-11-18, modified: 2024-09-30, language: english, status: in progress, importance: 2, confidence: likely
I finally make use of my daygame data by writing some code that implements a multi-armed bandit with Thompson sampling on beta-distributed estimates of what proportion of approaches in a particular location yield contact information.
Smooth ice,
A paradise
For one who knows how to dance well.
—Friedrich Nietzsche, “Für Tänzer”, 1882
Given the data of my daygame approaches, I've wondered for quite a while how I could use that data to improve my game. I don't think I've found anything solid yet, so instead I'm going to try to use that data to estimate where I should do my next daygame session. Beliefs are for action, after all.
For this, I trick ChatGPT into writing Julia code for a multi-armed bandit using Thompson sampling of beta-distributed values, with getting contact information as a reward of 1 and not getting any contact information as a reward of 0.
(I know that this is a super impoverished view on what makes a good daygame approach, but this is an exploratory exercise. I might add more & different factors later.)
Of course, I can't tell ChatGPT that I am doing pickup, so I instead say that I'm looking to optimize the quality of icecream I'm eating by selecting different icecream shops. (Title of conversation: "Bayesian Icecream Bandit").
The resulting code is wholly confused and bad, with multiple subtle and not so subtle bugs, and inelegant too. I reckon there's just not enough Julia training data to make it capable enough, but I haven't checked with the most recent models.
So after more than a year of procrastination, I decide to rewrite the code; the result is here.
It first loads the data, collects the number of successes (got contact info) and failures (didn't get contact info) per location, builds the corresponding Beta distribution and past success ratio, throws it all into the DataFrame bandit, and then draws one sample from each distribution. (The Beta distribution is useful here because the more approaches have been recorded at a location, the smaller its variance. This is exactly what we want: a less-explored location has a wider distribution, so it has a better chance of drawing a high sample and being tried again.)
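The steps described above can be sketched roughly as follows. This is not the actual script: it assumes Distributions.jl and DataFrames.jl, and the `approaches` list is made up for illustration. The uniform Beta(1, 1) prior is inferred from the output, where 0 successes and 1 failure give Beta(α=1.0, β=2.0).

```julia
using DataFrames, Distributions, Random

# Hypothetical toy data: one (location, got_contact) pair per approach.
approaches = [(709269, true), (709269, false), (709269, false),
              (449256, false), (449256, false),
              (76108, true), (76108, false)]

df = DataFrame(location = first.(approaches), contact = last.(approaches))

# Count successes and failures per location.
bandit = combine(groupby(df, :location),
                 :contact => sum => :successes,
                 :contact => (c -> count(!, c)) => :failures)

# Past success ratio, and a Beta(1, 1) prior updated with the observed counts.
bandit.success_prob = bandit.successes ./ (bandit.successes .+ bandit.failures)
bandit.dist = Beta.(bandit.successes .+ 1, bandit.failures .+ 1)

# Thompson sampling: draw once from each location's distribution
# and prefer the location with the highest draw.
Random.seed!(1)  # only to make the sketch reproducible
bandit.sample = rand.(bandit.dist)
sort!(bandit, :sample)

println("Next session at: ", last(bandit).location)
```

Because the samples are redrawn on every run, a location with few recorded approaches will sometimes come out on top even if its past success ratio is low, which is where the exploration comes from.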
So the output of the script can look something like this, where the most preferred option is at the bottom:
julia> sort(bandit, :sample)
32×7 DataFrame
 Row │ location  successes  failures  success_prob  dist                           sample     name
     │ Float64   Int64      Int64     Float64       Beta…                          Float64    String
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 571077.0          0         1           0.0  Beta{Float64}(α=1.0, β=2.0)    0.0202956  [REDACTED]
   2 │ 371851.0          1        10     0.0909091  Beta{Float64}(α=2.0, β=11.0)   0.0311173  [REDACTED]
   3 │ 449256.0          3        37         0.075  Beta{Float64}(α=4.0, β=38.0)   0.0320887  [REDACTED]
   4 │ 785084.0          2        35     0.0540541  Beta{Float64}(α=3.0, β=36.0)   0.0338077  [REDACTED]
   5 │  98955.0          1         7         0.125  Beta{Float64}(α=2.0, β=8.0)    0.0493673  [REDACTED]
  ⋮  │    ⋮          ⋮         ⋮           ⋮                     ⋮                     ⋮      [REDACTED]
  29 │ 817198.0          1         1           0.5  Beta{Float64}(α=2.0, β=2.0)     0.619935  [REDACTED]
  30 │ 276017.0          3         5         0.375  Beta{Float64}(α=4.0, β=6.0)     0.787144  [REDACTED]
  31 │ 692404.0          0         1           0.0  Beta{Float64}(α=1.0, β=2.0)     0.826964  [REDACTED]
  32 │ 295748.0          1         0           1.0  Beta{Float64}(α=2.0, β=1.0)     0.982625  [REDACTED]
                                                                                         23 rows omitted
The top option (namely 702595) is, unfortunately, in another city hundreds of kilometers from where I live.
I want to filter out such irrelevant locations, so I create sets of locations that are amenable to weekday/weekend and good/bad weather daygame:
weekday_good_weather=[709269, 449256, 76108, 449052, 175735, 276017, 796877, 835159, 823073, 696163, 843941, 132388, 496077, 32441, 399686, 793915]
weekday_bad_weather=[709269, 449256, 76108, 449052]
weekend_good_weather=[692404, 10939, 709269, 157691, 175735, 276017, 702595, 449256, 76108, 793915, 796877, 835159, 823073, 696163, 531828, 781627, 843941, 132388, 496077, 371851, 32441, 399686, 449052]
weekend_bad_weather=[709269, 449256, 76108, 449052, 702595, 531828]
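Picking the right set for a given day could be automated with a small lookup. The following is only a sketch: `location_sets` and `candidate_locations` are hypothetical names, the weather flag has to be supplied by hand, and the lists are copied from the four sets above.

```julia
using Dates

# Sets copied from the lists above, keyed by (is_weekend, good_weather).
location_sets = Dict(
    (false, true)  => [709269, 449256, 76108, 449052, 175735, 276017, 796877, 835159, 823073, 696163, 843941, 132388, 496077, 32441, 399686, 793915],
    (false, false) => [709269, 449256, 76108, 449052],
    (true,  true)  => [692404, 10939, 709269, 157691, 175735, 276017, 702595, 449256, 76108, 793915, 796877, 835159, 823073, 696163, 531828, 781627, 843941, 132388, 496077, 371851, 32441, 399686, 449052],
    (true,  false) => [709269, 449256, 76108, 449052, 702595, 531828],
)

# Hypothetical helper: the weekday comes from the date, the weather from the caller.
function candidate_locations(date::Date, good_weather::Bool)
    is_weekend = dayofweek(date) >= 6  # Dates counts 1 = Monday through 7 = Sunday
    return location_sets[(is_weekend, good_weather)]
end

candidates = candidate_locations(today(), true)
```

The resulting `candidates` vector can then stand in for one of the four named sets when filtering the bandit DataFrame.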
On a weekday with good weather (as it often is, at the time of writing), I can then filter for locations in my current city with such conditions:
julia> filter(x->x[:location] in weekday_good_weather, sort(bandit, :sample))
14×7 DataFrame
 Row │ location  successes  failures  success_prob  dist                           sample     name
     │ Float64   Int64      Int64     Float64       Beta…                          Float64    String
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 449256.0          3        37         0.075  Beta{Float64}(α=4.0, β=38.0)   0.0320887  [REDACTED]
   2 │ 132388.0          9        71        0.1125  Beta{Float64}(α=10.0, β=72.0)  0.0543564  [REDACTED]
   3 │ 175735.0          0         7           0.0  Beta{Float64}(α=1.0, β=8.0)    0.0923418  [REDACTED]
   4 │ 823073.0          2        13      0.133333  Beta{Float64}(α=3.0, β=14.0)     0.11243  [REDACTED]
   5 │ 449052.0          1        10     0.0909091  Beta{Float64}(α=2.0, β=11.0)    0.130813  [REDACTED]
   6 │ 796877.0          0         3           0.0  Beta{Float64}(α=1.0, β=4.0)     0.153522  [REDACTED]
   7 │ 696163.0          1         3          0.25  Beta{Float64}(α=2.0, β=4.0)     0.207392  [REDACTED]
   8 │ 399686.0          2         5      0.285714  Beta{Float64}(α=3.0, β=6.0)     0.221436  [REDACTED]
   9 │ 709269.0         22        87      0.201835  Beta{Float64}(α=23.0, β=88.0)   0.249274  [REDACTED]
  10 │ 835159.0          5        16      0.238095  Beta{Float64}(α=6.0, β=17.0)    0.256881  [REDACTED]
  11 │ 843941.0          0         1           0.0  Beta{Float64}(α=1.0, β=2.0)     0.272473  [REDACTED]
  12 │ 496077.0         12        28           0.3  Beta{Float64}(α=13.0, β=29.0)   0.313311  [REDACTED]
  13 │  76108.0          1         1           0.5  Beta{Float64}(α=2.0, β=2.0)     0.617513  [REDACTED]
  14 │ 276017.0          3         5         0.375  Beta{Float64}(α=4.0, β=6.0)     0.787144  [REDACTED]
The approach of using a multi-armed bandit here is nice because, if I follow it, I both avoid undervaluing really great opportunities (because they're so overgamed nobody goes there anymore) and notice when locations do get worse. I had for example thought that 449256 was a great location, but the statistics definitely say otherwise, and similarly for 449052.
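One way to check how firmly the data supports such a verdict is to look at the posterior mean and a credible interval for each location. The following is a sketch, assuming Distributions.jl; the Beta parameters are copied from the filtered table above, with 709269 included as a well-explored comparison, and the 90% level is an arbitrary choice.

```julia
using Distributions

# Posteriors copied from the filtered table: Beta(successes + 1, failures + 1).
posteriors = Dict(
    449256 => Beta(4, 38),
    449052 => Beta(2, 11),
    709269 => Beta(23, 88),
)

for (location, dist) in posteriors
    # Central 90% credible interval of the success probability.
    lo = quantile(dist, 0.05)
    hi = quantile(dist, 0.95)
    println(location, ": mean ", round(mean(dist), digits=3),
            ", 90% interval [", round(lo, digits=3), ", ", round(hi, digits=3), "]")
end
```

A narrow interval sitting at a low success probability says more than a low success ratio alone, since the latter can come from just a handful of approaches.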
Additional variables I could take into account would be my enjoyment of the approach, the attractiveness of the woman I'm speaking to, the amount of time I'm spending between approaches, …
I will, however, exercise my judgement: I'll probably take a closer look at 76108, even if I don't feel very enthusiastic about it.
And if I wanted to be really fancy, I could use a 2-dimensional Gaussian process, in kriging fashion, to interpolate geographical data and find the best daygame locations that way. Probably overkill.