An unknown process provides data samples z in [0,1]. The probability density of this process is unknown. Nevertheless, let us model it with the following distribution:
The distribution depends on a parameter θ, which is set to θ=0.3 in the above figure. The value of θ corresponding to the process is thus unknown, and we would like to determine it as successive data samples are provided. The distribution depicted above is called the model.
Let us consider the following a priori distribution for the parameter θ. Let us choose a parabolic one:
So in our world, the parameter θ is drawn according to the above probability density, and then the data samples are drawn according to the model parametrized by this θ value. The model is thus the probability density of the samples z, given the actual value of θ. In other words, the model is a conditional probability.
The probability of drawing a sample z thus depends on the two above probability densities, which leads to a joint probability for the occurrence of (z,θ) as follows:
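In symbols, the joint density is the product of the prior and the model. The notation below (p(θ) for the prior, p(z | θ) for the model) is an assumption made here for illustration; the second line gives the resulting density of a sample z once θ is integrated out.

```latex
\begin{align}
  p(z, \theta) &= p(z \mid \theta)\, p(\theta), \\
  p(z) &= \int_0^1 p(z \mid \theta)\, p(\theta)\, \mathrm{d}\theta .
\end{align}
```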
Let us now draw the data samples z according to the model, with a parameter θ=0.7. This value is the unknown one that the inference principle is expected to discover. The following shows how the probability density of the parameter θ is updated as samples are provided. This is Bayesian inference. The density indeed concentrates around the value θ=0.7.
These movies are the same, except for the initial prior on θ.
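The update described above can be sketched numerically on a grid of θ values. This is only a minimal sketch under stated assumptions: the exact shape of the model is not given in the text, so a triangular density on [0,1] peaking at θ is assumed here; the parabolic prior is taken to be 6θ(1-θ), and a flat prior stands in for the second initial prior. The names `model_pdf` and `posterior_on_grid` are hypothetical helpers.

```python
import numpy as np

def model_pdf(z, theta):
    """Assumed triangular model on [0, 1] whose peak sits at theta."""
    return np.where(z < theta, 2.0 * z / theta, 2.0 * (1.0 - z) / (1.0 - theta))

def posterior_on_grid(samples, thetas, prior):
    """Posterior density of theta on a regular grid, given independent samples."""
    dtheta = thetas[1] - thetas[0]
    log_post = np.log(prior)
    for z in samples:
        # Bayes update in log space; the floor avoids taking log(0).
        log_post = log_post + np.log(np.maximum(model_pdf(z, thetas), 1e-300))
    post = np.exp(log_post - log_post.max())
    return post / (post.sum() * dtheta)  # normalise so the density integrates to 1

rng = np.random.default_rng(0)
true_theta = 0.7  # the value the inference is expected to recover
samples = rng.triangular(0.0, true_theta, 1.0, size=500)

# Grid that excludes 0 and 1, where the assumed model is degenerate.
thetas = np.linspace(0.005, 0.995, 199)
parabolic_prior = 6.0 * thetas * (1.0 - thetas)  # assumed parabolic prior
flat_prior = np.ones_like(thetas)                # assumed alternative prior

post_parabolic = posterior_on_grid(samples, thetas, parabolic_prior)
post_flat = posterior_on_grid(samples, thetas, flat_prior)

# Location of the posterior maximum under each prior.
map_parabolic = thetas[np.argmax(post_parabolic)]
map_flat = thetas[np.argmax(post_flat)]
print(map_parabolic, map_flat)
```

Working in log space keeps the product of many likelihood terms from underflowing. With enough samples, both priors should yield posteriors that peak near the same θ, which is the behaviour the two movies illustrate.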