Turing.jl has a guide that shows how to write a model that allows you to either pass data or a missing
argument. In the first case it will get you the posterior, in the second case, instead, it will draw from all the variables.
Setup
using Turing
iterations = 1000
ϵ = 0.05
τ = 10
The working example is like this:
@model gdemo(x, ::Type{T} = Float64) where {T} = begin
if x === missing
# Initialize `x` if missing
x = Vector{T}(undef, 2)
end
s ~ InverseGamma(2, 3)
m ~ Normal(0, sqrt(s))
for i in eachindex(x)
x[i] ~ Normal(m, sqrt(s))
end
end
# Construct a model with x = missing
model = gdemo(missing)
c = sample(model, HMC(0.01, 5), 500);
I tried to adapt this code to a simple coin flip. The bernoulli
distribution, I think, should have data type Bool
(I tried with Int64
too though). I tried doing this:
@model coinflip2(y) = begin
if y === missing
y = Vector{Bool}(undef, 4)
end
p ~ Beta(1, 1)
N = length(y)
for n in 1:N
y[n] ~ Bernoulli(p)
end
end;
chain = sample(coinflip2(missing), HMC(ϵ, τ), iterations);
But it gave me a long error message I don't get but quite likely point to a type problem:
MethodError: Bool(::ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(:p, :y),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:p,Tuple{}},Int64},Array{Beta{Float64},1},
Trying to mimic the syntax from the working gdemo
function above (which I don't fully get), gives you:
@model coinflip2(y, ::Type{T} = Bool) where {T} = begin
if y === missing
y = Vector{T}(undef, 4)
end
p ~ Beta(1, 1)
N = length(y)
for n in 1:N
y[n] ~ Bernoulli(p)
end
end;
chain = sample(coinflip2(missing), HMC(ϵ, τ), iterations);
which also fails with an error message that starts with:
MethodError: Bool(::ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(:p, :y),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:p,Tuple{}},Int64},Array{Beta{Float64},1}
How do I write this correctly? Bonus points for explaining what I'm doing wrong :D Thanks!
Hamiltonian Monte Carlo works only on continuous distributions, since it differentiates a function of the PDF. When you sample from the joint model, (p, y)
, this also applies to the random variable y
, which, being Bernoulli, is certainly not continuous. Hence, the AD system used in the background (ForwardDiff
by default) will complain.
You can solve this by specifying a non-Hamiltonian sampler for y
using Gibbs
:
chain = sample(coinflip2(missing), Gibbs(HMC(ϵ, τ, :p), MH(:y)), iterations)
MH might not be the best choice, you'd have to think about that yourself, but it works.
Note that is not necessary for parameter estimation: just p
given y
is continuous and makes HMC happy:
chain = sample(coinflip2([0,1,1,0]), HMC(ϵ, τ), iterations)