Summary statistics

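In descriptive statistics, summary statistics are used to summarize a set of observations in order to communicate the largest amount of information as simply as possible. Statisticians commonly try to describe the observations in terms of their location, spread and shape and, where more than one variable is measured, their dependence.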

A common collection of order statistics used as summary statistics is the five-number summary, sometimes extended to a seven-number summary, together with the associated box plot.
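As a minimal sketch of the five-number summary, the following uses base R's fivenum() and boxplot.stats() on a small made-up data set (the values of x are an assumption chosen for illustration, not data from any source):

```r
# Small illustrative data set (made up for this sketch).
x <- c(2, 4, 4, 5, 7, 9, 10, 12, 15)

# fivenum() returns Tukey's five-number summary:
# minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(x)
print(five)  # 2 4 7 10 15

# boxplot.stats() computes the numbers a box plot would draw
# (whisker ends, hinges, median) without opening a graphics device.
bp <- boxplot.stats(x)
print(bp$stats)
```

Because this sample has no outliers, the whisker ends coincide with the minimum and maximum, so both calls give the same five values here.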

Entries in an analysis of variance table can also be regarded as summary statistics.[1]

Example

The following example uses R to compute the standard summary statistics of a random sample of 50 values drawn from a normal distribution with a mean of 0 and a standard deviation of 1:

> x <- rnorm(n=50, mean=0, sd=1)
> summary(x)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-1.72700 -0.49650 -0.05157  0.07981  0.67640  2.46700

Examples of summary statistics

Location

Common measures of location, or central tendency, are the arithmetic mean, median, mode, and interquartile mean.
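These measures of location can be sketched in base R as follows. The data are made up, and stat_mode is a hypothetical helper defined here, since base R has no built-in function for the statistical mode:

```r
# Twelve made-up values with one large outlier.
x <- c(1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 8, 40)

arith_mean <- mean(x)    # arithmetic mean, pulled upward by the outlier
med <- median(x)         # median, robust to the outlier

# Hypothetical helper: the most frequent value(s) in v.
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[counts == max(counts)])
}

# Interquartile mean: the mean of the middle 50% of the data.
# With a sample size divisible by 4 this is the 25% trimmed mean.
iq_mean <- mean(x, trim = 0.25)

print(c(mean = arith_mean, median = med,
        mode = stat_mode(x), iq_mean = iq_mean))
```

In this sample the outlier raises the arithmetic mean to about 7.33, while the median (5), the mode (5) and the interquartile mean (about 4.67) stay near the bulk of the data.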

Spread

Common measures of statistical dispersion are the standard deviation, variance, range, interquartile range, absolute deviation and the distance standard deviation. Measures that assess spread in comparison to the typical size of data values include the coefficient of variation.
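A short sketch of several of these dispersion measures in base R, again on made-up data. The absolute deviation is taken here as the mean absolute deviation about the mean, which is one of several possible conventions, and the distance standard deviation is left out because base R does not provide it:

```r
# Made-up sample with mean 5.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

s <- sd(x)                 # sample standard deviation (n - 1 denominator)
v <- var(x)                # sample variance (n - 1 denominator)
r <- diff(range(x))        # range, expressed as max - min
iqr <- IQR(x)              # interquartile range (default quantile type 7)

# Mean absolute deviation about the mean (one convention among several).
mean_abs_dev <- mean(abs(x - mean(x)))

# Coefficient of variation: spread relative to the typical size of the data.
cv <- sd(x) / mean(x)

print(c(sd = s, var = v, range = r, IQR = iqr,
        mean_abs_dev = mean_abs_dev, cv = cv))
```

Note that R's mad() function computes a scaled median absolute deviation, which is a different quantity from the mean absolute deviation used above.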

The Gini coefficient was originally developed to measure income inequality and is equivalent to one of the L-moments.
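As an illustration, the Gini coefficient of a sample can be computed as half the relative mean absolute difference. The sketch below uses hypothetical incomes and the simple (uncorrected) estimator that averages over all ordered pairs, including each value paired with itself; other estimators apply a small-sample correction:

```r
# Hypothetical incomes, for illustration only.
incomes <- c(10, 20, 30, 40)
n <- length(incomes)

# Mean absolute difference over all n^2 ordered pairs.
mean_abs_diff <- sum(abs(outer(incomes, incomes, "-"))) / n^2

# Gini coefficient: half the mean absolute difference, relative to the mean.
gini <- mean_abs_diff / (2 * mean(incomes))
print(gini)  # 0.25

# Perfect equality gives a coefficient of zero.
equal <- rep(25, 4)
gini_equal <- sum(abs(outer(equal, equal, "-"))) / length(equal)^2 /
  (2 * mean(equal))
print(gini_equal)  # 0
```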

Shape

Common measures of the shape of a distribution are skewness and kurtosis, while alternatives can be based on L-moments. A different measure is the distance skewness, for which a value of zero implies central symmetry.
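Base R has no built-in skewness or kurtosis functions, so the sketch below defines two hypothetical helpers using the simple moment-based definitions (central moments with an n denominator). Other definitions with bias corrections exist and give slightly different values:

```r
# Hypothetical helper: moment-based sample skewness, m3 / m2^(3/2).
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Hypothetical helper: moment-based excess kurtosis, m4 / m2^2 - 3.
excess_kurtosis <- function(v) {
  d <- v - mean(v)
  mean(d^4) / mean(d^2)^2 - 3
}

sym <- c(1, 2, 3, 4, 5)       # symmetric about 3
right <- c(1, 1, 1, 2, 10)    # long right tail

print(skewness(sym))          # 0 for a symmetric sample
print(skewness(right))        # positive for a right-skewed sample
print(excess_kurtosis(sym))   # -1.3
```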

Percentiles

A simple summary of a dataset is sometimes given by quoting particular order statistics as approximations to selected percentiles of a distribution.
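To illustrate, R's quantile() function can either return actual order statistics or interpolate between them, depending on its type argument. The data below are made up for this sketch:

```r
# Made-up, unordered sample of 10 values.
x <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)

# The order statistics are simply the sorted values.
print(sort(x))

probs <- c(0.25, 0.5, 0.75)

# type = 1 inverts the empirical distribution function, so every
# result is one of the observed values (an order statistic).
q1 <- quantile(x, probs = probs, type = 1)

# The default, type = 7, interpolates linearly between order statistics.
q7 <- quantile(x, probs = probs)

print(q1)
print(q7)
```

With type = 1 the reported quartiles are always observed values, whereas the default may return values that do not occur in the sample, such as a median of 3.5 here.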

Dependence

The most common measure of dependence between paired random variables is the Pearson product-moment correlation coefficient, while a common alternative summary statistic is Spearman's rank correlation coefficient. A distance correlation of zero implies independence.
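The difference between the two coefficients can be seen in a small sketch using base R's cor(). The paired data are made up so that y increases with x but not linearly:

```r
# Made-up paired data: y is a monotone but nonlinear function of x.
x <- c(1, 2, 3, 4, 5)
y <- x^3

# Pearson measures linear association.
pearson <- cor(x, y, method = "pearson")

# Spearman is the Pearson correlation of the ranks, so it measures
# monotone association.
spearman <- cor(x, y, method = "spearman")

print(pearson)    # less than 1, because the relationship is not linear
print(spearman)   # 1, because y always increases with x
```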
