ISO/IEC 8859

ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encodings. The series consists of numbered parts, such as ISO/IEC 8859-1 and ISO/IEC 8859-2. There are 15 parts, not counting the abandoned ISO/IEC 8859-12. The ISO working group that maintained the series has been disbanded.

ISO/IEC 8859 parts 1, 2, 3, and 4 were originally Ecma International standard ECMA-94.

In June 2004, the ISO/IEC working group responsible for maintaining eight-bit coded character sets disbanded and ceased all maintenance of the ISO/IEC 8859 series. In the area of character encoding, ISO now concentrates on the Universal Character Set (ISO/IEC 10646); see also Unicode. In computing applications, encodings that provide full UCS support (such as UTF-8 and UTF-16) are increasingly favored over 8-bit encodings such as ISO/IEC 8859-1.

Introduction

While the bit patterns of the 95 printable ASCII characters are sufficient to exchange information in modern English, most other languages that use the Latin alphabet need additional symbols not covered by ASCII, such as ß (German), ñ (Spanish), å (Swedish and other Nordic languages) and ő (Hungarian). ISO/IEC 8859 sought to remedy this problem by using the eighth bit of an 8-bit byte to provide positions for another 128 characters. (This bit was previously used for data transmission protocol information, or was left unused.) However, more characters were needed than could fit in a single 8-bit character encoding, so several mappings were developed, including at least ten just to cover the Latin script.
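
The effect of splitting the extra characters across several mappings can be illustrated with a short Python sketch. It assumes the codecs that Python's standard library ships under the names "iso-8859-1" and "iso-8859-2"; the helper `fits` is a hypothetical name introduced here for illustration only.

```python
# One byte with the eighth bit set, read under two parts of the series.
raw = bytes([0xF5])
as_latin1 = raw.decode("iso-8859-1")  # part 1 (Western European)
as_latin2 = raw.decode("iso-8859-2")  # part 2 (Central European)

# Bytes below 0x80 keep their ASCII meaning in both parts.
ascii_text = b"plain ASCII"
same_everywhere = ascii_text.decode("iso-8859-1") == ascii_text.decode("iso-8859-2")


def fits(text: str, codec: str) -> bool:
    """Return True if every character of `text` has a position in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(as_latin1, as_latin2, same_everywhere)
print(fits("ßñå", "iso-8859-1"), fits("ő", "iso-8859-1"), fits("ő", "iso-8859-2"))
```

The same byte value yields a different letter depending on which part is assumed, which is why text in these encodings has to be labeled with the part that was used to write it.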

The ISO/IEC 8859-n encodings contain only printable characters, and were designed to be used in conjunction with control characters mapped to the unassigned bytes. To this end, a series of encodings registered with the IANA add the C0 control set (control characters mapped to bytes 0 to 31) from ISO 646 and the C1 control set (control characters mapped to bytes 128 to 159) from ISO 6429, resulting in full 8-bit character maps with most, if not all, bytes assigned. These sets have ISO-8859-n as their preferred MIME name or, in cases where a preferred MIME name isn't specified, their canonical name. Many people use the terms ISO/IEC 8859-n and ISO-8859-n interchangeably. ISO/IEC 8859-11 did not get such a charset assigned, presumably because it was almost identical to TIS 620.
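
As a rough check of this layout, the following Python sketch decodes all 256 byte values with the standard library's "iso-8859-1" codec and counts which positions land on control characters. It assumes that this codec follows the IANA-style charset (graphic characters plus C0 and C1 controls) and uses Unicode general category "Cc" as the test for a control character; the names `table` and `is_control` are illustrative only.

```python
import unicodedata

# Decode every possible byte value; index i of `table` is the character for byte i.
table = bytes(range(256)).decode("iso-8859-1")


def is_control(ch: str) -> bool:
    """True if Unicode classifies `ch` as a control character (category Cc)."""
    return unicodedata.category(ch) == "Cc"


c0 = [b for b in range(0x00, 0x20) if is_control(table[b])]  # bytes 0 to 31
c1 = [b for b in range(0x80, 0xA0) if is_control(table[b])]  # bytes 128 to 159
controls = [b for b in range(256) if is_control(table[b])]

# Everything that is not a control character is a graphic position.
graphic_count = 256 - len(controls)

print(len(c0), len(c1), len(controls), graphic_count)
```

Besides the two blocks of 32 control positions, the count of controls includes byte 127 (DEL), which belongs to neither the C0 nor the C1 set.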
