utf-8 (7) Linux Manual Page
UTF-8 – an ASCII-compatible multibyte Unicode encoding
Description
The Unicode 3.0 character set occupies a 16-bit code space. The most obvious Unicode encoding (known as UCS-2) consists of a sequence of 16-bit words. Such strings can contain—as part of many 16-bit characters—bytes such as ‘
