UTF-8

UTF-8 (UCS[1] Transformation Format, 8-bit) is a multibyte character encoding for Unicode. Like UTF-16 and UTF-32, UTF-8 can represent every character in the Unicode character set, but unlike them it is backward-compatible with ASCII and avoids the complications of endianness and the resulting need for byte order marks (BOM). For these and other reasons, UTF-8 has become the dominant character encoding for the World Wide Web, accounting for more than half of all Web pages.[2][3] The Internet Engineering Task Force (IETF) requires all Internet protocols to identify the encoding used for character data, and the supported character encodings must include UTF-8.[4] The Internet Mail Consortium (IMC) recommends that all e‑mail programs be able to display and create mail using UTF-8.[5] UTF-8 is also increasingly used as the default character encoding in operating systems, programming languages, APIs, and software applications.
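The endianness point can be illustrated with a minimal Python sketch. It uses only the standard `str.encode` method; the sample string and variable names are chosen here for illustration and do not come from the article. UTF-16 yields different byte sequences depending on byte order, whereas UTF-8 has a single byte sequence for any given text.

```python
# Sample text: 'A' (U+0041) followed by the euro sign (U+20AC).
text = "A\u20ac"

# UTF-16 output depends on the chosen byte order.
utf16_le = text.encode("utf-16-le")  # little-endian, no BOM
utf16_be = text.encode("utf-16-be")  # big-endian, no BOM

# UTF-8 is defined as a sequence of bytes, so there is only one form.
utf8 = text.encode("utf-8")

print(utf16_le.hex(" "))  # 41 00 ac 20
print(utf16_be.hex(" "))  # 00 41 20 ac
print(utf8.hex(" "))      # 41 e2 82 ac
```

Because the two UTF-16 forms differ, a reader of UTF-16 data needs either a byte order mark or outside knowledge of the byte order; a reader of UTF-8 data does not.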

UTF-8 encodes each of the 1,112,064[6] code points in the Unicode character set using one to four 8-bit bytes (termed “octets” in the Unicode Standard). Code points with lower numerical values (i.e., earlier code positions in the Unicode character set, which tend to occur more frequently in practice) are encoded using fewer bytes,[7] making the encoding scheme reasonably efficient. In particular, the first 128 characters of the Unicode character set, which correspond one-to-one with ASCII, are each encoded as a single octet with the same binary value as the corresponding ASCII character, so that valid ASCII text is also valid UTF-8-encoded Unicode text.
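The variable-length scheme described above can be sketched in Python. The helper `utf8_length` is a hypothetical name introduced for illustration; its thresholds are the standard UTF-8 range boundaries, and the figure of 1,112,064 is taken here as the full code space (U+0000 to U+10FFFF) minus the 2,048 surrogate code points, which UTF-8 does not encode.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for one code point (illustrative helper)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not encodable in UTF-8")
    if code_point < 0x80:       # U+0000..U+007F: the ASCII range
        return 1
    if code_point < 0x800:      # U+0080..U+07FF
        return 2
    if code_point < 0x10000:    # U+0800..U+FFFF
        return 3
    return 4                    # U+10000..U+10FFFF


# Cross-check the helper against Python's built-in encoder.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    assert utf8_length(ord(ch)) == len(ch.encode("utf-8"))

# ASCII text has the same bytes in both encodings.
assert "hello".encode("ascii") == "hello".encode("utf-8")

# Encodable code points: the whole code space minus the surrogates.
encodable = 0x110000 - (0xDFFF - 0xD800 + 1)
print(encodable)  # 1112064
```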

The official IANA code for the UTF-8 character encoding is UTF-8.[8]
