In statistics, a sample is a subset of a population. Typically, the population is very large, making a census or a complete enumeration of all the values in the population impractical or impossible. The sample represents a subset of manageable size. Samples are collected and statistics are calculated from the samples so that one can make inferences or extrapolations from the sample to the population. This process of collecting information from a sample is referred to as sampling.
The best way to avoid a biased or unrepresentative sample is to select a random sample, also known as a probability sample. A random sample is defined as a sample where each individual member of the population has a known, nonzero chance of being selected as part of the sample. Common types of random sample include simple random samples, systematic samples, stratified random samples, and cluster random samples.
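The four designs named above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the population of 100 numbered individuals, and the strata and cluster groupings are all hypothetical, and the fixed seed is there only to make the run repeatable.

```python
import random

def simple_random_sample(population, n, rng):
    """Every subset of size n is equally likely (inclusion chance n/N)."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Random start, then every k-th member, with k = N // n."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Simple random sample of a fixed size within each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical population: 100 numbered individuals.
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
population = list(range(100))

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)

# Assumed strata: two halves of the population.
strata = {"low": population[:50], "high": population[50:]}
strat = stratified_sample(strata, 5, rng)

# Assumed clusters: ten blocks of ten consecutive individuals.
clusters = {c: population[c * 10:(c + 1) * 10] for c in range(10)}
clus = cluster_sample(clusters, 2, rng)
```

In each design the chance of selection is known in advance and nonzero for every member (for example 10/100 in the simple random sample and 2/10 in the cluster sample), which is what makes them probability samples.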
A sample that is not random is called a nonrandom sample or a nonprobability sample. Some examples of nonrandom samples are convenience samples, judgment samples, purposive samples, quota samples, snowball samples, and quadrature nodes in quasi-Monte Carlo methods.
Mathematical description of random sample
In mathematical terms, given a random variable X with distribution F, a random sample of length n = 1, 2, 3, ... is a set of n independent, identically distributed (iid) random variables X_{1}, ..., X_{n}, each with distribution F.[1]
A sample concretely represents n experiments in which we measure the same quantity. For example, if X represents the height of an individual and we measure n individuals, X_{i} will be the height of the ith individual. Note that a sample of random variables (i.e. a set of measurable functions) must not be confused with the realizations of these variables (which are the values that these random variables take). In other words, X_{i} is a function representing the measurement at the ith experiment and x_{i} = X_{i}(ω) is the value we actually get when making the measurement.
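The distinction between X_{i} and x_{i} can be made concrete with a small Python sketch. It rests on several assumptions that are not part of the definition: the outcome ω is modelled as the seed of a pseudorandom generator, heights are taken to be normally distributed, and the mean of 170 and standard deviation of 10 are invented for illustration. The helper name `make_random_variable` is hypothetical.

```python
import random

def make_random_variable(i, n, mean, sd):
    """Return X_i: a function that maps an outcome omega to a measured value."""
    def X_i(omega):
        # Assumption: omega is modelled as the seed that fixes all n draws.
        rng = random.Random(omega)
        draws = [rng.gauss(mean, sd) for _ in range(n)]
        return draws[i]
    return X_i

n = 5
# The sample: n random variables, that is, n functions (no numbers yet).
sample = [make_random_variable(i, n, mean=170.0, sd=10.0) for i in range(n)]

# The realizations: the numbers obtained once an outcome omega is fixed.
omega = 2024
x = [X_i(omega) for X_i in sample]

# A different outcome gives different realizations of the same variables.
x_other = [X_i(7) for X_i in sample]
```

Here `sample` plays the role of X_{1}, ..., X_{n} and `x` plays the role of x_{1}, ..., x_{n}: evaluating the same functions at the same ω always returns the same values, while a different ω yields a different set of measurements.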
