Parametric models assume some finite set of parameters $\theta$. Given the parameters, future predictions are independent of the observed data, so $\theta$ captures everything there is to know about the data. The complexity of the model is therefore bounded even if the amount of data is unbounded.
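The independence statement can be written as a single equation. The symbols are assumptions introduced here for illustration: $\theta$ denotes the parameters, $x$ a future prediction, and $\mathcal{D}$ the observed data.

$$P(x \mid \theta, \mathcal{D}) = P(x \mid \theta)$$

Once $\theta$ is known, conditioning on $\mathcal{D}$ changes nothing, which is the sense in which the parameters summarize the data.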
Non-parametric models assume that the data distribution cannot be defined in terms of such a finite set of parameters. However, they can often be defined by assuming an infinite-dimensional $\theta$, which we usually think of as a function. The amount of information that $\theta$ can capture about the data can grow as the amount of data grows, which makes these models more flexible.
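The contrast between bounded and growing model complexity can be made concrete with a small sketch. It compares a Gaussian fit, which always reduces the data to two numbers, with a kernel density estimate, which keeps every observation. The kernel density estimate is only a familiar stand-in for a non-parametric model and is not taken from the text above; the function names, the bandwidth and the bimodal toy data are likewise illustrative assumptions.

```python
import math
import random


def fit_gaussian(data):
    """Parametric fit: the data is summarized by (mean, variance), whatever its size."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    return mean, var


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def kde_pdf(x, data, bandwidth=0.5):
    """Non-parametric estimate: every stored observation contributes one Gaussian bump."""
    return sum(gaussian_pdf(x, xi, bandwidth ** 2) for xi in data) / len(data)


rng = random.Random(0)
for n in (10, 100, 1000):
    # Bimodal toy data (an assumption): two well-separated clusters.
    data = [rng.gauss(-2.0, 0.5) if rng.random() < 0.5 else rng.gauss(2.0, 0.5)
            for _ in range(n)]
    params = fit_gaussian(data)
    # The parametric summary stays at 2 numbers; the KDE must keep all n points.
    print(n, "parametric size:", len(params), "non-parametric size:", len(data))
    print("  density at 0:", gaussian_pdf(0.0, *params), "vs", kde_pdf(0.0, data))
```

The stored size of the Gaussian fit stays at two numbers for every `n`, while the kernel density estimate carries the whole data set along, so what it can represent grows with the amount of data.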
Gaussian processes define a distribution over functions

$$f \sim \mathcal{GP}(m, k),$$

where $m$ is the mean function and $k$ is the covariance function. We can think of Gaussian processes as “infinite-dimensional” Gaussians.
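One way to make the “infinite-dimensional Gaussian” picture concrete: evaluated at any finite set of inputs, a Gaussian process reduces to an ordinary multivariate Gaussian whose mean vector and covariance matrix come from the mean function and the covariance function. The sketch below draws one such finite-dimensional sample using only the standard library. The squared-exponential covariance, the zero mean, the input grid and the jitter value are all assumptions chosen for illustration.

```python
import math
import random


def rbf_kernel(x1, x2, length_scale=1.0):
    """Squared-exponential covariance function (an illustrative choice)."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def cholesky(a):
    """Return lower-triangular L with L L^T = a for a symmetric positive-definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def sample_gp(xs, mean_fn, kernel, rng, jitter=1e-6):
    """Draw the values of one sample function at the finite set of inputs xs."""
    n = len(xs)
    # Covariance matrix of the function values; jitter on the diagonal keeps it
    # numerically positive definite.
    cov = [[kernel(xs[i], xs[j]) + (jitter if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]  # independent standard normals
    # mean + L z has exactly the desired mean vector and covariance matrix.
    return [mean_fn(xs[i]) + sum(L[i][k] * z[k] for k in range(i + 1))
            for i in range(n)]


rng = random.Random(0)
xs = [0.5 * i for i in range(10)]
f = sample_gp(xs, lambda x: 0.0, rbf_kernel, rng)
print([round(v, 3) for v in f])
```

Nearby inputs receive similar values because the covariance function assigns them a correlation close to one. A different covariance function would give sample functions with a different character.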
The Dirichlet process is the infinite-dimensional generalization of the Dirichlet distribution. In other words, a Dirichlet process is a probability distribution whose range is itself a set of probability distributions.
The Dirichlet process is specified by a base distribution $H$ and a positive real number $\alpha$ called the concentration parameter. The base distribution is the expected value of the process, i.e., the Dirichlet process draws distributions “around” the base distribution.
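The formal definition is as follows. Given a measurable set $S$, a base probability distribution $H$ and a positive real number $\alpha$, the Dirichlet process $\mathrm{DP}(H, \alpha)$ is a stochastic process whose sample path is a probability distribution over $S$, such that the following holds. For any measurable finite partition of $S$, denoted $\{B_i\}_{i=1}^{n}$, if $X \sim \mathrm{DP}(H, \alpha)$, then

$$(X(B_1), \dots, X(B_n)) \sim \mathrm{Dir}(\alpha H(B_1), \dots, \alpha H(B_n)),$$

where $\mathrm{Dir}$ denotes the Dirichlet distribution.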
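The partition property can be checked numerically. The sketch below draws random distributions with the stick-breaking construction, a standard way to sample from a Dirichlet process that is not described in the text above, truncated once the leftover mass is negligible. As assumptions for illustration, the base distribution is a standard normal, the concentration parameter is `alpha = 2.0`, and the partition splits the real line at zero. Each half has base probability 1/2, so the mass a draw assigns to the left half should follow a Beta(alpha/2, alpha/2) distribution, with mean 1/2 and variance 0.25 / (alpha + 1).

```python
import random


def draw_dp(alpha, base_sampler, rng, tol=1e-8):
    """One truncated draw from a Dirichlet process via stick-breaking.

    Returns (atoms, weights). The weights sum to just under 1 because the
    stick is cut off once the remaining mass falls below tol.
    """
    atoms, weights = [], []
    remaining = 1.0
    while remaining > tol:
        beta = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * beta)
        atoms.append(base_sampler(rng))     # atom location drawn from the base distribution
        remaining *= 1.0 - beta
    return atoms, weights


def mass(atoms, weights, in_set):
    """Probability that the drawn discrete distribution assigns to a set."""
    return sum(w for a, w in zip(atoms, weights) if in_set(a))


rng = random.Random(0)
alpha = 2.0


def base(r):
    """Base distribution: standard normal (an assumption)."""
    return r.gauss(0.0, 1.0)


# Mass assigned to the left half-line by 2000 independent draws.
masses = []
for _ in range(2000):
    atoms, weights = draw_dp(alpha, base, rng)
    masses.append(mass(atoms, weights, lambda a: a <= 0.0))

mean = sum(masses) / len(masses)
var = sum((m - mean) ** 2 for m in masses) / len(masses)
print("empirical mean:", mean, "theoretical:", 0.5)
print("empirical variance:", var, "theoretical:", 0.25 / (alpha + 1.0))
```

Increasing `alpha` shrinks the variance, so draws concentrate more tightly around the base distribution. A small `alpha` lets individual draws put most of their mass on a few atoms.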