DBCS

A double-byte character set (DBCS) is a character set that represents each character with 2 bytes. A DBCS supports national languages that contain a large number of unique characters or symbols: 1 byte can represent at most 2^8 = 256 characters, while 2 bytes can represent up to 2^16 = 65,536 characters. Examples of such languages include Japanese, Korean, and Chinese.

The term DBCS (Double Byte Character Set) has two basic meanings:

  • In CJK (Chinese, Japanese and Korean) computing, the term "DBCS" traditionally means a character set in which every graphic character not representable by an accompanying SBCS is encoded in two bytes; Han characters would generally comprise most of these two-byte characters.
  • The term "DBCS" can also mean a character set in which all characters (including all control characters) are encoded in two bytes.

The DBCS in CJK computing

The term DBCS traditionally refers to a character set where each graphic character is encoded in two bytes. A DBCS in this sense always has lead bytes with the most significant bit set (that is, byte values of 0x80 or greater, outside the 7-bit range), and is always paired with a single-byte character set (SBCS). Furthermore, for the practical reason of maintaining compatibility with unmodified, off-the-shelf software, the SBCS is associated with halfwidth characters and the DBCS with fullwidth characters.
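The lead-byte rule described above can be illustrated with a short Python sketch. It assumes a simplified model in which every byte of 0x80 or greater starts a two-byte character and every lower byte is a complete single-byte character. The function name split_dbcs is hypothetical, and Big5 is used only as sample data. Real code pages have exceptions to this model (Shift-JIS, for instance, has single-byte halfwidth katakana with the high bit set), so this is not a general-purpose decoder.

```python
from typing import List


def split_dbcs(data: bytes) -> List[bytes]:
    """Split a mixed SBCS/DBCS byte string into per-character chunks.

    Assumption (simplified model): a byte >= 0x80 is a lead byte that
    starts a two-byte character; a byte < 0x80 is a single-byte character.
    """
    chunks = []
    i = 0
    while i < len(data):
        if data[i] >= 0x80:  # most significant bit set: lead byte
            if i + 1 >= len(data):
                raise ValueError("truncated double-byte character")
            # Take the trail byte without inspecting it: in some code
            # pages a trail byte may itself be below 0x80.
            chunks.append(data[i:i + 2])
            i += 2
        else:  # 7-bit value: single-byte (SBCS) character
            chunks.append(data[i:i + 1])
            i += 1
    return chunks


text = "A\u4e2dB\u6587"  # 'A', a Han character, 'B', a Han character
chunks = split_dbcs(text.encode("big5"))
print([len(c) for c in chunks])            # byte length of each character
print([c.decode("big5") for c in chunks])  # the characters, recovered
```

Scanning from the start of the data matters here: because a trail byte can fall in the same range as a single-byte character, a character boundary cannot be found reliably by looking at one byte in isolation.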

Sometimes, the use of the term "DBCS" can imply an underlying structure that does not comply with ISO 2022. For example, "DBCS" can sometimes mean a double-byte encoding that is specifically not EUC.

Note that this original meaning of DBCS differs from what some consider correct usage today. Some insist that these character sets be properly called either multi-byte character sets (MBCS) or variable-width encodings, because character sets like EUC-JP, EUC-TW, GB18030 and UTF-8 use more than 2 bytes for some characters and only 1 byte for others.
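The variable-width behaviour is easy to observe with Python's built-in codecs. The sketch below uses a hypothetical helper, byte_widths, and an arbitrary three-character sample to list how many bytes each character occupies in UTF-8 and GB18030.

```python
from typing import List


def byte_widths(text: str, encoding: str) -> List[int]:
    """Return the encoded length in bytes of each character in text."""
    return [len(ch.encode(encoding)) for ch in text]


# 'A' (ASCII), U+4E2D (a Han character), U+1F600 (outside the BMP)
sample = "A\u4e2d\U0001F600"

for encoding in ("utf-8", "gb18030"):
    print(encoding, byte_widths(sample, encoding))
# utf-8   -> [1, 3, 4]
# gb18030 -> [1, 2, 4]
```

Neither encoding uses exactly two bytes for every character, which is why the MBCS or variable-width label fits them better than DBCS.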

Controversy

Some people use DBCS to mean the UTF-16 and UTF-8 encodings, while others use the term to mean older (pre-Unicode) code pages that use more than one byte per character. Shift-JIS, GB2312 and Big5 are a few code pages that can contain more than one byte per character, but even for these code pages the term DBCS is incorrect, because they are really multi-byte character sets (MBCS). Some IBM mainframes do have true DBCS code pages, which contain only the double-byte portion of a multi-byte code page.
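One way to see why the label is contested is to check which byte widths an encoding actually produces for a given text. The following Python sketch uses two hypothetical helpers, widths_used and is_fixed_double_byte. The result only describes the sample text passed in, not the encoding as a whole.

```python
from typing import Set


def widths_used(text: str, encoding: str) -> Set[int]:
    """Return the set of per-character byte lengths seen in text."""
    return {len(ch.encode(encoding)) for ch in text}


def is_fixed_double_byte(text: str, encoding: str) -> bool:
    """True if every character of text takes exactly two bytes."""
    return widths_used(text, encoding) == {2}


# 'A', halfwidth katakana A (U+FF71), fullwidth katakana A (U+30A2)
mixed = "A\uff71\u30a2"
print(widths_used(mixed, "shift_jis"))            # {1, 2}: an MBCS mix
print(is_fixed_double_byte(mixed, "utf-16-le"))   # True for this text

# Adding a character outside the BMP (U+1F600) breaks the fixed width.
print(widths_used(mixed + "\U0001F600", "utf-16-le"))  # {2, 4}
```

Under Shift-JIS the halfwidth characters take one byte and the fullwidth character takes two, while UTF-16 looks like a fixed two-byte encoding only until a surrogate pair appears.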

The term "DBCS enablement" is ambiguous when used for software internationalization. It can mean writing software for East Asian markets using older, code-page-based technology, or it can mean using Unicode. Sometimes the term also implies translation into an East Asian language. Usually, "Unicode enablement" means internationalizing software by using Unicode, while "DBCS enablement" means using the mutually incompatible code pages of the various East Asian countries. Since Unicode, unlike many other code pages, supports all the major languages of East Asia, it is generally easier to enable and maintain software that uses Unicode. DBCS (non-Unicode) enablement is usually desired only when much older operating systems or applications do not support Unicode.

Wikimedia Foundation. 2010.
