Likelihood principle

related topics
{rate, high, increase}
{theory, work, human}
{math, number, function}
{law, state, case}
{math, energy, light}
{film, series, show}
{work, book, publish}
{system, computer, user}
{disease, patient, cell}
{mi², represent, 1st}

In statistics, the likelihood principle is a controversial principle of statistical inference which asserts that all of the information in a sample is contained in the likelihood function.

A likelihood function arises from a conditional probability distribution considered as a function of its distributional parameterization argument, conditioned on the data argument. For example, consider a model which gives the probability density function of observable random variable X as a function of a parameter θ. Then for a specific value x of X, the function L(θ | x) = P(X=x | θ) is a likelihood function of θ: it gives a measure of how "likely" any particular value of θ is, if we know that X has the value x. Two likelihood functions are equivalent if one is a scalar multiple of the other. The likelihood principle states that all information from the data relevant to inferences about the value of θ is found in the equivalence class. The strong likelihood principle applies this same criterion to cases such as sequential experiments where the sample of data that is available results from applying a stopping rule to the observations earlier in the experiment.[1]




  • X is the number of successes in twelve independent Bernoulli trials with probability θ of success on each trial, and
  • Y is the number of independent Bernoulli trials needed to get three successes, again with probability θ of success on each trial.

Then the observation that X = 3 induces the likelihood function

and the observation that Y = 12 induces the likelihood function

These are equivalent because each is a scalar multiple of the other. The likelihood principle therefore says the inferences drawn about the value of θ should be the same in both cases.

The difference between observing X = 3 and observing Y = 12 is only in the design of the experiment: in one case, one has decided in advance to try twelve times; in the other, to keep trying until three successes are observed. The outcome is the same in both cases.

Full article ▸

related documents
Value at risk
Arithmetic mean
Quality of life
Limits to Growth
Weighted mean
Lorenz curve
Best, worst and average case
Malthusian catastrophe
Actual infinity
Pareto efficiency
Chi-square distribution
Zipf's law
Multi-valued logic
Extension (semantics)
Principle of bivalence
Log-normal distribution
Natural language processing
Pareto principle
Present value
Accuracy and precision
Modus tollens