- Variational Solomonoff Induction
February 18, 2021
The free energy principle is a variational Bayesian method for approximating posteriors. Can free energy minimization, combined with program synthesis methods from machine learning, tractably approximate Solomonoff induction (i.e. universal inference)? In these notes, I explore what the combination of these ideas looks like. Machine learning: I want to make an important clarification about “Bayesian machine learning”.…
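
A quick sketch of the identity behind that first claim, in standard notation (the symbols $q$, $\theta$, $x$ are mine, not quoted from the post): for an approximate posterior $q(\theta)$ over hypotheses $\theta$ and observed data $x$, the variational free energy decomposes as

$$
\mathbb{E}_{q(\theta)}\!\left[\log q(\theta) - \log p(x, \theta)\right] = D_{KL}\!\left(q(\theta) \,\middle\|\, p(\theta \mid x)\right) - \log p(x),
$$

so minimizing the free energy on the left over a tractable family of $q$'s drives $q(\theta)$ toward the exact posterior $p(\theta \mid x)$.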
- Active inference tutorial (actions)
February 7, 2021
Previous attempts: In Free Energy Principle 1st Pass, I used a tutorial to try to understand the free energy formalism. I figured out the “timeless” and actionless case, but I became confused when actions and time were added. In Free energy principle and Solomonoff induction, I tried to translate between the formalism presented in https://danijar.…
- Free Energy Principle 1st Pass
February 7, 2021
Related notes: The free-energy principle a unified brain theory; Active inference tutorial (actions). Hackmd version of this note: https://hackmd.io/@z5RLVXyrTg-JLCnL9c_xOQ/r1KSFUjkO. My current understanding. Note on probability notation: These are my informal notes. Probability notation can be cumbersome and overly verbose. As is customary in machine learning and many of the sciences, I’m not going to bother using probability notation correctly.…