via LingPipe Blog by lingpipe on 9/9/09

Bayesian Inference is Based on Probability Models

Bayesian models provide full probability distributions over both observable data and unobservable model parameters. Bayesian statistical inference is carried out using standard probability theory.

What’s a Prior?

The full Bayesian probability model includes the unobserved parameters. The marginal distribution over parameters is known as the “prior” parameter distribution, as it may be computed without reference to observable data. The conditional distribution over parameters given observed data is known as the “posterior” parameter distribution.

Non-Bayesian Statistics

Non-Bayesian statisticians eschew probability models of unobservable model parameters. Without such models, non-Bayesians cannot perform probabilistic inferences available to Bayesians, such as defining the probability that a model parameter (such as the mean height of an adult male American) lies in a given range (say 5′6″ to 6′0″).

Instead of modeling the posterior probabilities of parameters, non-Bayesians perform hypothesis testing and compute confidence intervals, the subtleties of interpretation of which have confused introductory statistics students for decades.

Bayesian Technical Apparatus

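The sampling distribution $p(y|\theta)$ models the probability of observable data $y$ given unobservable model parameters $\theta$.

The prior distribution $p(\theta)$ models the probability of the parameters $\theta$.

The full joint distribution over parameters and data is computed with the chain rule, $p(y,\theta) = p(y|\theta) \, p(\theta)$.

The posterior distribution $p(\theta|y)$ of the parameters given the observed data is derived from the sampling and prior distributions via Bayes’s rule,

$$p(\theta|y) = \frac{p(y|\theta) \, p(\theta)}{p(y)} = \frac{p(y|\theta) \, p(\theta)}{\int_{\Theta} p(y|\theta') \, p(\theta') \, d\theta'}$$

The posterior predictive distribution $p(\tilde{y}|y)$ for new data $\tilde{y}$ given observed data $y$ is the average of the sampling distribution over parameters proportional to their posterior probability,

$$p(\tilde{y}|y) = \int_{\Theta} p(\tilde{y}|\theta) \, p(\theta|y) \, d\theta$$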
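As a rough illustration of Bayes’s rule, the sketch below approximates a posterior on a grid for a binomial sampling distribution with a uniform prior, then sums posterior mass over a range of parameter values. The data (7 successes in 10 trials), the grid size, the range, and the function name `grid_posterior` are all assumptions made for the example.

```python
import math

def grid_posterior(successes, trials, grid_size=1001):
    """Approximate p(theta | y) on an evenly spaced grid over [0, 1]."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Sampling distribution p(y | theta): binomial likelihood.
    likelihood = [
        math.comb(trials, successes) * t**successes * (1 - t) ** (trials - successes)
        for t in thetas
    ]
    # Prior p(theta): uniform on [0, 1] (an assumption for this sketch).
    prior = [1.0 for _ in thetas]
    # Joint p(y, theta) = p(y | theta) * p(theta), by the chain rule.
    joint = [lik * pri for lik, pri in zip(likelihood, prior)]
    # Dividing by the sum approximates dividing by the marginal p(y).
    total = sum(joint)
    posterior = [j / total for j in joint]
    return thetas, posterior

# Hypothetical data: 7 successes in 10 trials.
thetas, posterior = grid_posterior(successes=7, trials=10)

# Posterior probability that theta lies in a chosen range, here [0.5, 0.9].
prob_in_range = sum(p for t, p in zip(thetas, posterior) if 0.5 <= t <= 0.9)
print(f"Pr[0.5 <= theta <= 0.9 | y] is roughly {prob_in_range:.3f}")
```

The final sum is the kind of statement about a parameter lying in a range that a posterior makes available; a finer grid gives a closer approximation.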

The key feature is the incorporation into predictive inference of the uncertainty in the posterior parameter estimate. In particular, the posterior predictive distribution is an overdispersed variant of the sampling distribution. The extra dispersion arises by integrating over the posterior.
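One way to see where the extra dispersion comes from is the law of total variance, with the posterior $p(\theta|y)$ as the mixing distribution. This is a standard identity offered as a sketch; it assumes new data $\tilde{y}$ is conditionally independent of the observed data $y$ given $\theta$.

```latex
\mathrm{Var}[\tilde{y} \mid y]
  = \mathbb{E}_{p(\theta|y)}\big[ \mathrm{Var}[\tilde{y} \mid \theta] \big]
  + \mathrm{Var}_{p(\theta|y)}\big[ \mathbb{E}[\tilde{y} \mid \theta] \big]
```

The first term is the sampling variance averaged over the posterior. The second term is nonnegative and accounts for uncertainty about the parameters; it vanishes only when the posterior leaves no uncertainty about the sampling mean.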

Conjugate Priors

Conjugate priors, where the prior and posterior are drawn from the same family of distributions, are convenient but not necessary. For instance, if the sampling distribution is binomial, a beta-distributed prior leads to a beta-distributed posterior. With a beta posterior and binomial sampling distribution, the posterior predictive distribution is beta-binomial, the overdispersed form of the binomial. If the sampling distribution is Poisson, a gamma-distributed prior leads to a gamma-distributed posterior; the posterior predictive distribution is negative-binomial, the overdispersed form of the Poisson.
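The beta-binomial case can be sketched in a few lines. The prior Beta(1, 1), the data (7 successes in 10 trials), the 5 new trials, and the function names are hypothetical choices for illustration.

```python
import math

def beta_binomial_update(alpha, beta, successes, trials):
    """Return the Beta posterior parameters after observing binomial data."""
    return alpha + successes, beta + (trials - successes)

def log_beta(a, b):
    """Log of the beta function B(a, b), computed with log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, n, alpha, beta):
    """Predictive probability of k successes in n new trials under Beta(alpha, beta)."""
    return math.comb(n, k) * math.exp(
        log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)
    )

# Hypothetical numbers: Beta(1, 1) prior, 7 successes in 10 trials.
post_alpha, post_beta = beta_binomial_update(1.0, 1.0, successes=7, trials=10)

# Posterior predictive distribution over the number of successes in 5 new trials.
predictive = [beta_binomial_pmf(k, 5, post_alpha, post_beta) for k in range(6)]
print(post_alpha, post_beta)
print([round(p, 4) for p in predictive])
```

Because the update only adds counts to the prior parameters, no numerical integration is needed, which is the convenience conjugacy buys.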

Point Estimate Approximations

An approximate alternative to full Bayesian inference uses $p(\tilde{y}|\hat{\theta})$ for prediction of new data $\tilde{y}$, where $\hat{\theta}$ is a point estimate.

The maximum of the posterior distribution provides the so-called maximum a posteriori (MAP) estimate,

$$\theta^* = \arg\max_{\theta} \, p(\theta|y) = \arg\max_{\theta} \, p(y|\theta) \, p(\theta)$$

If the prior is uniform, the MAP estimate coincides with the maximum likelihood estimate (MLE), so called because it maximizes the likelihood $p(y|\theta)$ of the data. The MLE is popular among non-Bayesian statisticians because it requires no prior: a uniform prior contributes only a constant factor and may be dropped from the optimization.
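For a binomial sampling distribution with a beta prior, both estimates have closed forms, which makes the relationship easy to check. The sketch below is a minimal illustration; the function names and the example counts are hypothetical.

```python
def mle_binomial(successes, trials):
    """Maximum likelihood estimate of the binomial success probability."""
    return successes / trials

def map_binomial(successes, trials, alpha, beta):
    """Mode of the Beta(alpha + successes, beta + failures) posterior.

    The closed form applies only when both posterior parameters exceed 1,
    so that the mode lies in the interior of (0, 1).
    """
    a = alpha + successes
    b = beta + (trials - successes)
    if a <= 1 or b <= 1:
        raise ValueError("posterior mode is not in the interior of (0, 1)")
    return (a - 1) / (a + b - 2)

# Hypothetical data: 7 successes in 10 trials.
print(mle_binomial(7, 10))                       # likelihood only
print(map_binomial(7, 10, alpha=1.0, beta=1.0))  # uniform prior, matches the MLE
print(map_binomial(7, 10, alpha=2.0, beta=2.0))  # non-uniform prior pulls toward 0.5
```

With the uniform Beta(1, 1) prior the two estimates agree; any other prior shifts the MAP estimate away from the MLE.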

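Another common point estimate is the posterior mean, the expected value of the parameter under the posterior. It is unbiased with respect to the posterior, in the sense that the expected error $\mathbb{E}_{p(\theta|y)}[\theta - \bar{\theta}]$ is zero, although it is not in general an unbiased estimator in the frequentist sense,

$$\bar{\theta} = \mathbb{E}_{p(\theta|y)}[\theta] = \int_{\Theta} \theta \, p(\theta|y) \, d\theta$$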
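When the posterior has no convenient closed form, the integral for the posterior mean is often approximated by averaging posterior draws. The sketch below assumes a Beta(8, 4) posterior purely so that the Monte Carlo average can be compared against the known mean; the function name, number of draws, and seed are illustrative choices.

```python
import random

def posterior_mean_mc(alpha, beta, num_draws=100_000, seed=0):
    """Approximate the mean of a Beta(alpha, beta) posterior by simulation."""
    rng = random.Random(seed)
    draws = (rng.betavariate(alpha, beta) for _ in range(num_draws))
    return sum(draws) / num_draws

# Hypothetical Beta(8, 4) posterior.
alpha, beta = 8.0, 4.0
exact = alpha / (alpha + beta)           # closed-form mean of a beta distribution
approx = posterior_mean_mc(alpha, beta)  # Monte Carlo estimate of the same integral
print(exact, approx)
```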

Point estimates may be reasonably accurate if the posterior has low variance. If the posterior is diffuse, prediction with point estimates tends to be underdispersed, in the sense of underestimating the variance of the predictive distribution. This is a kind of overfitting which, unlike the usual situation of overfitting due to model complexity, arises from the oversimplification of the variance component of the predictive model.
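The underdispersion can be made concrete for the beta-binomial case by comparing the variance of a plug-in binomial prediction with that of the full posterior predictive distribution. The two posteriors below, a diffuse Beta(2, 2) and a concentrated Beta(200, 200), and the choice of 10 new trials are assumptions for illustration, as are the function names.

```python
def plug_in_variance(n, alpha, beta):
    """Variance of Binomial(n, theta_hat), with theta_hat the posterior mean."""
    theta_hat = alpha / (alpha + beta)
    return n * theta_hat * (1 - theta_hat)

def beta_binomial_variance(n, alpha, beta):
    """Variance of the beta-binomial posterior predictive for n new trials."""
    s = alpha + beta
    return n * alpha * beta * (s + n) / (s**2 * (s + 1))

n = 10
for alpha, beta in [(2.0, 2.0), (200.0, 200.0)]:
    plug_in = plug_in_variance(n, alpha, beta)
    full = beta_binomial_variance(n, alpha, beta)
    print(f"Beta({alpha:g}, {beta:g}): plug-in {plug_in:.3f}, predictive {full:.3f}")
```

Both posteriors have the same mean, so the plug-in variance is identical in the two cases. The predictive variance is much larger under the diffuse posterior and only slightly larger under the concentrated one.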

 
 
