Linear classifier


In the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by making a classification decision based on the value of a linear combination of the characteristics. An object's characteristics are also known as feature values and are typically presented to the machine in a vector called a feature vector.

Definition

If the input feature vector to the classifier is a real vector \vec x, then the output score is

y = f(\vec w \cdot \vec x) = f\left(\sum_j w_j x_j\right),

where \vec w is a real vector of weights and f is a function that converts the dot product of the two vectors into the desired output. (In other words, \vec{w} is a one-form or linear functional mapping \vec x onto R.) The weight vector \vec w is learned from a set of labeled training samples. Often f is a simple function that maps all values above a certain threshold to the first class and all other values to the second class. A more complex f might give the probability that an item belongs to a certain class.
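As a minimal sketch of the two kinds of f described above (the weights and feature values here are hypothetical, not from the text):

```python
import numpy as np

def classify(x, w, threshold=0.0):
    """Simple f: class 1 if the score w·x exceeds the threshold, else class 0."""
    score = np.dot(w, x)
    return 1 if score > threshold else 0

def class_probability(x, w):
    """A more complex f: the logistic sigmoid maps the score to a probability."""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))

w = np.array([0.4, -0.3, 0.2])   # hypothetical learned weight vector
x = np.array([1.0, 2.0, 3.0])    # hypothetical feature vector
print(classify(x, w))            # score = 0.4 > 0, so class 1
print(class_probability(x, w))
```

The threshold version implements the hard two-class decision; the sigmoid version is one common choice when a probabilistic output is wanted.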

For a two-class classification problem, one can visualize the operation of a linear classifier as splitting a high-dimensional input space with a hyperplane: all points on one side of the hyperplane are classified as "yes", while the others are classified as "no".
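The hyperplane picture can be made concrete in two dimensions. In this hypothetical sketch, the weight vector \vec w = (1, -1) defines the line x_1 = x_2, and the sign of the score tells us which side a point falls on:

```python
import numpy as np

# Hyperplane w·x = 0 in 2D: here the line x1 = x2 (hypothetical example).
w = np.array([1.0, -1.0])

points = np.array([[2.0, 1.0],   # w·x = +1: positive side of the line
                   [1.0, 2.0]])  # w·x = -1: negative side of the line
labels = ["yes" if score > 0 else "no" for score in points @ w]
print(labels)  # ['yes', 'no']
```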

A linear classifier is often used in situations where the speed of classification is an issue, since it is often the fastest classifier, especially when \vec x is sparse. However, decision trees can be faster. Also, linear classifiers often work very well when the number of dimensions in \vec x is large, as in document classification, where each element in \vec x is typically the number of occurrences of a word in a document (see document-term matrix). In such cases, the classifier should be well-regularized.
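The speed advantage with sparse \vec x comes from only touching the nonzero entries. A minimal sketch, assuming a document is stored as a dictionary of word-index/count pairs (a made-up representation for illustration):

```python
# Sparse feature vector as {word_index: count}; only nonzero entries are visited.
def sparse_score(x_sparse, w):
    """Dot product w·x over the nonzero entries of a sparse feature vector."""
    return sum(count * w[i] for i, count in x_sparse.items())

w = [0.5, -0.2, 0.0, 0.8]   # hypothetical dense weight vector
doc = {0: 2, 3: 1}          # word 0 occurs twice, word 3 once; rest are zero
print(sparse_score(doc, w)) # 2*0.5 + 1*0.8 = 1.8
```

The cost is proportional to the number of distinct words in the document, not the vocabulary size, which is why linear classifiers scale well to document classification.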

Generative models vs. discriminative models

There are two broad classes of methods for determining the parameters of a linear classifier \vec w.[1][2] Methods of the first class model the conditional density functions P(\vec x|{\rm class}). Examples of such algorithms include linear discriminant analysis, which assumes Gaussian conditional density models, and the naive Bayes classifier, which assumes conditionally independent features.

