Thinking in Generative Models in pymdp¶
This guide explains how to map Active Inference concepts onto the list-based
model representation used in pymdp.
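For a model with \(F\) hidden-state factors and \(M\) observation modalities over
\(T\) timesteps, a common joint factorization is:

\[
p(o_{1:T}, s_{1:T}, \pi) = p(\pi)\, p(s_1) \prod_{t=1}^{T} p(o_t \mid s_t) \prod_{t=1}^{T-1} p(s_{t+1} \mid s_t, u_t^\pi)
\]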
Here, \(\pi\) refers to policies; this is a discrete latent variable whose realizations correspond to sequences of actions over time, and \(u_t^\pi\) denotes the action entailed by policy \(\pi\) at time \(t\).
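In pymdp, those terms are typically factorized as:

\[
p(s_1) = \prod_{f=1}^{F} p(s_1^f)
\]

\[
p(o_t \mid s_t) = \prod_{m=1}^{M} p\big(o_t^m \mid \{s_t^i\}_{i \in \text{A\_dependencies}[m]}\big)
\]

\[
p(s_{t+1} \mid s_t, u_t^\pi) = \prod_{f=1}^{F} p\big(s_{t+1}^f \mid \{s_t^i\}_{i \in \text{B\_dependencies}[f]},\ \{u_t^{\pi,j}\}_{j \in \text{B\_action\_dependencies}[f]}\big)
\]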
Mental model¶
Use two independent indexing systems:
- Observation modalities (m) index independent sensory channels.
- Hidden-state factors (f) index latent causes.
In code, this means:
- modality-indexed lists: A[m], observations[m], C[m], pA[m]
- factor-indexed lists: D[f], qs[f], B[f], pB[f]
Observation modalities: A[m] and observations[m]¶
Each modality gets its own likelihood tensor:
- A[m] = P(obs_m | state_{i in A_dependencies[m]})
- observations[m] is the actual observation for modality m
So if you have two modalities (for example, location and reward), you should
expect len(A) == 2 and your observation input to have two modality entries.
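As a rough illustration of the modality-indexed layout, the sketch below builds two likelihood tensors with plain NumPy. It does not call pymdp. The sizes, the `random_likelihood` helper, and the convention that the observation axis comes first with no batch dimension are assumptions made for this example.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
num_states = [3, 2]              # f=0 -> location (3 states), f=1 -> context (2 states)
num_obs = [3, 2]                 # m=0 -> location obs, m=1 -> reward obs
A_dependencies = [[0], [0, 1]]   # location obs depends on location; reward obs on both

rng = np.random.default_rng(0)

def random_likelihood(n_obs, dep_sizes):
    """Return a tensor of shape (n_obs, *dep_sizes), normalized over axis 0."""
    raw = rng.random((n_obs, *dep_sizes))
    return raw / raw.sum(axis=0, keepdims=True)

# One likelihood tensor per modality.
A = [
    random_likelihood(num_obs[m], [num_states[f] for f in A_dependencies[m]])
    for m in range(len(num_obs))
]

# One observation entry per modality (here as integer indices).
observations = [2, 0]

# Indexing A[m] with observations[m] gives the likelihood of that observation
# as a function of the factors the modality depends on.
likelihoods = [A[m][observations[m]] for m in range(len(A))]

for m in range(len(A)):
    print(m, A[m].shape, likelihoods[m].shape)
```

Each A[m] sums to one along its first axis, so every combination of the relevant hidden states defines a valid distribution over that modality's observations.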
Hidden-state factors: D[f], qs[f], B[f]¶
Each hidden-state factor gets its own prior, posterior, and transition model:
- D[f]: prior over factor-f states
- qs[f]: posterior over factor-f states
- B[f] = P(s_{[f,t+1]} | state_{i in B_dependencies[f]}, action_{j in B_action_dependencies[f]})
This means you can reason about each factor semantically (for example, context, location, or object identity) and keep dimensions aligned by factor index.
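The factor-indexed lists can be sketched the same way. The example below is plain NumPy, not pymdp's own construction code. It assumes a hypothetical layout in which B[f] has axes (next state, dependent states, dependent actions), with one control factor that only affects location.

```python
import numpy as np

num_states = [3, 2]                 # f=0 -> location, f=1 -> context
num_controls = [2]                  # one hypothetical control factor with 2 actions
B_dependencies = [[0], [1]]         # each factor depends only on its own past state
B_action_dependencies = [[0], []]   # only location is controllable

# D[f]: uniform prior over each factor's states.
D = [np.full(n, 1.0 / n) for n in num_states]

# B[0] has shape (next_location, location, action).
B_location = np.zeros((3, 3, 2))
B_location[:, :, 0] = np.eye(3)                       # action 0: stay
B_location[:, :, 1] = np.roll(np.eye(3), 1, axis=0)   # action 1: move to the next location, wrapping around

# B[1] has shape (next_context, context): context never changes.
B_context = np.eye(2)
B = [B_location, B_context]

# qs[f]: current beliefs. Suppose location 0 is known and context is still at its prior.
qs = [np.array([1.0, 0.0, 0.0]), D[1].copy()]

# Propagate each factor's belief through its own transition model.
action = 1
predicted = [B[0][:, :, action] @ qs[0], B[1] @ qs[1]]
print(predicted[0])  # [0. 1. 0.]
print(predicted[1])  # [0.5 0.5]
```

Because every list is indexed by f, D[f], qs[f], and B[f] always share the same leading dimension, num_states[f].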
Dependencies connect modalities and factors¶
pymdp lets you be explicit about sparse structure:
- A_dependencies[m]: which factors modality m depends on
- B_dependencies[f]: which previous factors influence factor f transitions
- B_action_dependencies[f]: which action/control factors influence factor f transitions
This is the key to building structured models without forcing every modality to depend on every factor.
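One way to see what the dependency lists buy you is to derive the tensor shapes they imply. The helper below, `expected_shapes`, is hypothetical and is not part of pymdp. It assumes the output axis comes first, followed by the dependent state axes and then the dependent action axes.

```python
def expected_shapes(num_obs, num_states, num_controls,
                    A_dependencies, B_dependencies, B_action_dependencies):
    """Return (A_shapes, B_shapes) implied by the dependency lists."""
    A_shapes = [
        (num_obs[m], *(num_states[f] for f in A_dependencies[m]))
        for m in range(len(num_obs))
    ]
    B_shapes = [
        (num_states[f],
         *(num_states[i] for i in B_dependencies[f]),
         *(num_controls[j] for j in B_action_dependencies[f]))
        for f in range(len(num_states))
    ]
    return A_shapes, B_shapes

A_shapes, B_shapes = expected_shapes(
    num_obs=[3, 2], num_states=[3, 2], num_controls=[2],
    A_dependencies=[[0], [0, 1]],
    B_dependencies=[[0], [1]],
    B_action_dependencies=[[0], []],
)
print(A_shapes)  # [(3, 3), (2, 3, 2)]
print(B_shapes)  # [(3, 3, 2), (2, 2)]
```

Here the location modality only carries an axis for the location factor. A fully dense likelihood for it would need shape (3, 3, 2), with the context axis contributing nothing.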
Minimal indexing example¶
# Two modalities
# m=0 -> location observation
# m=1 -> reward observation
A = [A_location, A_reward]
obs = [obs_location, obs_reward]
# Two hidden-state factors
# f=0 -> location state
# f=1 -> context state
D = [D_location, D_context]
qs = [qs_location, qs_context]
B = [B_location, B_context]
Read this as:
- A[1] defines how reward observations are generated.
- qs[0] is your current belief over location states.
- B[1] defines context dynamics.
Common shape/indexing pitfalls¶
- len(observations) does not match len(A).
- Mixing modality index m and factor index f.
- A_dependencies, B_dependencies, or B_action_dependencies indices not matching your list layout.
- Passing raw integer observations when your setup expects categorical vectors (or vice versa).
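Several of these pitfalls can be caught before running inference. The sketch below uses hypothetical helpers (`check_model`, `to_one_hot`, `to_index`) written in plain NumPy. None of them are pymdp functions, and the checks assume the observation axis of each A[m] comes first.

```python
import numpy as np

def check_model(A, observations, A_dependencies, num_states):
    """Raise ValueError on common list-layout mismatches (hypothetical helper)."""
    if len(observations) != len(A):
        raise ValueError(f"{len(observations)} observations for {len(A)} modalities")
    if len(A_dependencies) != len(A):
        raise ValueError("A_dependencies needs one entry per modality")
    for m, deps in enumerate(A_dependencies):
        if any(f < 0 or f >= len(num_states) for f in deps):
            raise ValueError(f"A_dependencies[{m}] refers to an unknown factor")
        expected = tuple(num_states[f] for f in deps)
        if A[m].shape[1:] != expected:
            raise ValueError(
                f"A[{m}] has state dims {A[m].shape[1:]}, expected {expected}"
            )

def to_one_hot(index, num_obs):
    """Convert an integer observation into a categorical (one-hot) vector."""
    vec = np.zeros(num_obs)
    vec[index] = 1.0
    return vec

def to_index(one_hot):
    """Convert a one-hot observation vector back into an integer index."""
    return int(np.argmax(one_hot))

# Hypothetical model with two modalities and two factors.
A = [np.ones((3, 3)) / 3, np.ones((2, 3, 2)) / 2]
check_model(A, [2, 0], [[0], [0, 1]], [3, 2])   # passes silently

try:
    check_model(A, [2], [[0], [0, 1]], [3, 2])  # one observation is missing
except ValueError as err:
    print("caught:", err)
```

Converting explicitly with to_one_hot or to_index at the boundary of your code makes it obvious which observation format each function expects.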
How this maps to the agent loop¶
At each step:
- infer states with infer_states(...) to update qs[f],
- infer policies with infer_policies(qs),
- sample actions and propagate beliefs through B[f].
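To make the three steps concrete, here is a deliberately simplified NumPy sketch for a single factor and a single modality. It is not pymdp's implementation. `infer_states_sketch` and `infer_policies_sketch` are hypothetical stand-ins, state inference is exact Bayes, and policies are one-step actions scored only by expected log preference, with no epistemic term and deterministic action selection instead of sampling.

```python
import numpy as np

# Hypothetical 3-state, 3-observation, 2-action model.
A = [np.eye(3) * 0.8 + 0.1 * (1 - np.eye(3))]                         # noisy identity likelihood
B = [np.stack([np.eye(3), np.roll(np.eye(3), 1, axis=0)], axis=-1)]   # action 0: stay, action 1: move on
C = [np.log(np.array([0.1, 0.3, 0.6]))]                               # log preferences over observations
D = [np.full(3, 1.0 / 3)]                                             # uniform prior over states

def infer_states_sketch(prior, A_m, obs_index):
    """Exact Bayesian update for one factor and one modality."""
    posterior = A_m[obs_index] * prior
    return posterior / posterior.sum()

def infer_policies_sketch(qs_f, A_m, B_f, C_m):
    """Score each one-step action by expected log preference, then softmax."""
    scores = np.array([
        (A_m @ (B_f[:, :, a] @ qs_f)) @ C_m
        for a in range(B_f.shape[-1])
    ])
    q_pi = np.exp(scores - scores.max())
    return q_pi / q_pi.sum()

prior = D[0]
true_state = 0
actions = []
for t in range(3):
    obs = true_state                                      # noiseless observation, for clarity
    qs = [infer_states_sketch(prior, A[0], obs)]          # step 1: update qs[f]
    q_pi = infer_policies_sketch(qs[0], A[0], B[0], C[0]) # step 2: beliefs over policies
    action = int(np.argmax(q_pi))                         # step 3: pick the most probable action
    actions.append(action)
    prior = B[0][:, :, action] @ qs[0]                    # propagate beliefs through B[f]
    true_state = int(np.argmax(B[0][:, true_state, action]))

print(actions, true_state)
```

The posterior after each step becomes the prior for the next one once it has been pushed through B[f], which is the same belief-propagation pattern described above.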
For end-to-end loops, combine this mental model with the rollout() guide.