ISO/IEC 10646 Questions and Answers | |
Q1. |
What are the benefits of unifying coding standards? |
A1. |
With a unified coding standard, computers are capable of accurately processing and displaying electronic information in different languages. Users no longer need conversion tools to handle electronic information encoded in different coding standards. Distortion of information can be reduced during electronic communication, thus facilitating the exchange of electronic information across geographical areas. |
---|---|
Q2. |
How does a unified coding standard benefit the development of a common Chinese language interface? |
A2. |
With a unified coding standard, computers in different parts of the world can display electronic information encoded in the same coding standard. Computers in Mainland China, Hong Kong and Taiwan can become capable of accurately displaying electronic information in traditional Chinese, simplified Chinese and Chinese characters specific to Hong Kong. Users no longer need to use different coding standards for the different sets of Chinese characters, thus avoiding the problems in electronic communication conducted in Chinese. |
Q3. |
What is the International Organization for Standardization (ISO)? |
A3. |
The ISO is a non-governmental organization established in 1947 (https://www.iso.org/). It comprises members from more than 160 countries. Its mission is to develop international standards that facilitate exchange in various areas (e.g. trade, information and technology) among different parts of the world. |
Q4. |
What is the ISO/IEC 10646? |
A4. |
ISO/IEC 10646 is an international coding standard developed under the aegis of the International Organization for Standardization (ISO). It encodes the characters of the major languages of the world into a common character set. |
Q5. |
When was the ISO/IEC 10646 released? |
A5. |
The ISO released the first version of the ISO/IEC 10646 in 1993. It was called ISO/IEC 10646-1:1993.
|
Q6. |
What is the current development status of the ISO/IEC 10646? |
A6. |
Ideographic characters are characters whose appearance is related to their meaning, such as the Han characters. Ideographic characters are added to the ISO/IEC 10646 in phases, each phase contributing a new block: the CJK Unified Ideographs Extension A block, the Extension B block, the Extension C block, the Extension D block, and so on.
|
Q7. |
What is an ideographic character? |
A7. |
The International Organization for Standardization categorizes characters from different regions of the world by their characteristics. Ideographic characters are characters whose appearance is related to their meaning. One example is the Han characters, used mainly in East and Southeast Asian countries or territories such as Mainland China, Hong Kong, Taiwan, Macao, Japan, South Korea, North Korea, Vietnam and Singapore. |
Q8. |
What is the Ideographic Research Group (IRG)? |
A8. |
The IRG is a working group under the International Organization for Standardization. Its mission is to develop the repertoire of ideographic characters in the ISO/IEC 10646. The IRG has developed the CJK Unified Ideographs Block, the Extension A Block, the Extension B Block, the Extension C Block, the Extension D Block, the Extension E Block, the Extension F Block and the Extension G Block. |
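As an illustration of how these blocks partition the code space, the following minimal sketch looks up which CJK Unified Ideographs block a character belongs to. The ranges are the block boundaries published in the Unicode character database; the function name `cjk_block` is a hypothetical helper chosen for this example, not part of any standard library.

```python
from typing import Optional

# Block boundaries (inclusive) as listed in the Unicode Blocks.txt data file.
# ISO/IEC 10646 uses the same code points.
CJK_BLOCKS = [
    ("CJK Unified Ideographs", 0x4E00, 0x9FFF),
    ("CJK Unified Ideographs Extension A", 0x3400, 0x4DBF),
    ("CJK Unified Ideographs Extension B", 0x20000, 0x2A6DF),
    ("CJK Unified Ideographs Extension C", 0x2A700, 0x2B73F),
    ("CJK Unified Ideographs Extension D", 0x2B740, 0x2B81F),
    ("CJK Unified Ideographs Extension E", 0x2B820, 0x2CEAF),
    ("CJK Unified Ideographs Extension F", 0x2CEB0, 0x2EBEF),
    ("CJK Unified Ideographs Extension G", 0x30000, 0x3134F),
]


def cjk_block(ch: str) -> Optional[str]:
    """Return the name of the CJK block containing ch, or None if not found."""
    code_point = ord(ch)
    for name, first, last in CJK_BLOCKS:
        if first <= code_point <= last:
            return name
    return None


if __name__ == "__main__":
    for ch in ("\u4e2d", "\u3400", "\U00020000", "A"):
        print(f"U+{ord(ch):04X}", cjk_block(ch))
```

Note that a block range may contain code points that are not yet assigned, so a result other than `None` only says which block the code point falls in.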
Q9. |
Which countries/regions are members of the Ideographic Research Group? |
A9. |
IRG members include Mainland China, Hong Kong, Macao, Taipei Computer Association, Singapore, Japan, South Korea, North Korea, Vietnam and the USA. Representatives from the Unicode Consortium also attend IRG meetings to coordinate the synchronization between the ISO/IEC 10646 and Unicode. |
Q10. |
What is Unicode? |
A10. |
Unicode is a character coding system designed by the Unicode Consortium to support the interchange, processing and display of the written texts of many languages in the world. The Unicode Consortium comprises mainly hardware and software vendors. |
Q11. |
What is the relationship between Unicode and the ISO/IEC 10646? |
A11. |
In 1991, the ISO and the Unicode Consortium decided to cooperate in defining a universal coding standard for multilingual texts. Since then, the two organizations have worked closely to extend the ISO/IEC 10646 and Unicode and to keep them synchronized. The ISO releases information on characters and code points in the ISO/IEC 10646, while the Unicode Consortium supplements the characters and code points with implementation algorithms and semantic information. The ISO/IEC 10646 and the corresponding version of Unicode are code-for-code identical. Unicode can be regarded as the implementation version of the ISO/IEC 10646. Therefore, products supporting Unicode also support the ISO/IEC 10646. |
Q12. |
What is ISO/IEC 10646 Extension B and what benefit does it bring? |
A12. |
A 32-bit code point is represented as a pair of 16-bit values called surrogates. Surrogates are taken from two special ranges of Unicode values: the lead (high) surrogates, U+D800 to U+DBFF, and the trail (low) surrogates, U+DC00 to U+DFFF.
Unicode was originally designed to use 16-bit code points, which can represent only about 65,000 characters. After years of development, it became clear that 16-bit code points are insufficient to represent all the common scripts used worldwide. With the adoption of 32-bit code points, the limit is extended to about 1 million characters, which is enough to represent all the common scripts. It also makes it possible to use all the ideographic characters encoded in the ISO/IEC 10646. The latest version of ISO/IEC 10646 contains more than 70,000 ideographic characters, including the characters of the Kangxi Dictionary, Hanyu Dazidian and Hanyu Dacidian. The additional ideographic characters made available in this way help daily electronic communication in Chinese to be conducted more accurately and efficiently. |
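The arithmetic behind a surrogate pair can be sketched as follows. This is a minimal illustration of the UTF-16 mapping for code points above U+FFFF, using U+20000 (the first code point of the Extension B block) as the sample; the function name `to_surrogate_pair` is hypothetical and chosen only for this example.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into (lead, trail) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogate pairs")
    offset = code_point - 0x10000        # 20-bit value
    lead = 0xD800 + (offset >> 10)       # upper 10 bits
    trail = 0xDC00 + (offset & 0x3FF)    # lower 10 bits
    return lead, trail


if __name__ == "__main__":
    lead, trail = to_surrogate_pair(0x20000)
    print(f"U+20000 -> {lead:04X} {trail:04X}")   # U+20000 -> D840 DC00

    # Cross-check against Python's built-in UTF-16 encoder.
    print(chr(0x20000).encode("utf-16-be").hex().upper())   # D840DC00
```

Each surrogate carries 10 bits, so a pair addresses 2^20 = 1,048,576 additional code points, which is where the figure of about 1 million characters comes from.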
Q13. |
Is the new version of ISO/IEC 10646 and HKSCS (collectively referred to as the "Standard") backward compatible with its corresponding old version? |
A13. |
The new version of the "Standard" is backward compatible with its corresponding old version. However, in respect of software implementation, characters newly included in the new version of the "Standard" may not be properly viewed or displayed on software platforms that support only the previous version of the "Standard". In addition, existing software applications that support only the previous version of the "Standard" may not be able to properly handle newly included characters, including those HKSCS characters with code points assigned by the ISO in the new version of the "Standard". Users who encounter problems in handling Chinese characters while using GovHK Online Services may refer to the FAQ section. |
Q14. |
How can I browse the ISO/IEC 10646 version of this website? |
A14. |
The ISO/IEC 10646 version of this Chinese website is encoded with UTF-8, which is supported by the most commonly used web browsers such as Google Chrome and Mozilla Firefox. To browse the ISO/IEC 10646 version of this website, please refer to the following steps:
|
Q15. |
My computer platform supports ISO/IEC 10646. Why are some Chinese characters in the documents or certificates issued by some institutions not exactly the same as those displayed on my computer platform? |
A15. |
ISO/IEC 10646 provides a unified character coding standard for the communication and exchange of electronic information. How the glyphs, i.e. the shapes of characters, represented by the character codes are displayed or printed depends on the fonts selected by the application software. The Ideographic Research Group under ISO/IEC 10646 examines glyphs from different character sources for unification according to the Procedure for the unification and arrangement of CJK Ideographs (Annex S of the ISO/IEC 10646 document). Unifiable glyphs are assigned the same code point. This means a code point may represent one or more glyphs. The table below exemplifies the different glyph shapes of a single code point affected by the use of different fonts.
Notes
Since the documents or certificates in question may be printed by computer systems or equipment with a default font different from that of your computer, it is possible that some of the glyphs therein are not exactly the same as those displayed on your computer platform. A video at https://www.youtube.com/watch?v=WEvJqfUZwcE demonstrates how to find the ISO/IEC 10646 code point of a Chinese character. For further information on character sources and unification, please refer to the ISO/IEC 10646 document available for download at https://standards.iso.org/ittf/PubliclyAvailableStandards/. The document may help you determine whether some similar glyphs are unifiable. |
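For readers who prefer a programmatic check, the code point of a character can also be looked up with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `code_point_label` is hypothetical, and the names returned depend on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata


def code_point_label(ch: str) -> str:
    """Format the code point of a single character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


if __name__ == "__main__":
    for ch in "\u9999\u6e2f":  # the two characters of "Hong Kong" in Chinese
        # unicodedata.name() returns the formal character name, or the
        # supplied default if the bundled database does not know the character.
        print(ch, code_point_label(ch), unicodedata.name(ch, "<unnamed>"))
```

The code point is the same whichever font renders the character, so two documents showing slightly different glyphs can be compared by their code points.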
Q16. |
Some Chinese characters may have multiple glyphs, such as “悦” and “悅”. To support the glyph “悦”, can we simply change the glyph of “悅” to “悦” in the font file technically? |
A16. |
Some Chinese characters may have multiple glyphs, but the form of certain glyphs cannot be changed arbitrarily. This is because the different glyphs of a character may have been assigned separate code points in the ISO/IEC 10646. Changing the form of a glyph may result in identical glyphs at two different code points. For example, “悦” and “悅” are assigned the separate code points U+60A6 and U+6085 respectively.
To support the glyph “悦”, a font developer should work on the glyph with code point U+60A6 instead of changing the glyph of U+6085 from “悅” to “悦”. Otherwise, there will be identical glyphs at the two code points U+60A6 and U+6085, which is confusing and undesirable for electronic data interchange. More examples can be found in Annex S of the ISO/IEC 10646 document, which is available for download at https://standards.iso.org/ittf/PubliclyAvailableStandards/. |
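The point that the two forms are distinct characters for software, whatever a font draws, can be checked with a short sketch like the one below. It uses escape sequences so that the result does not depend on the editor's font; the constant and function names are hypothetical and chosen for this example.

```python
FORM_60A6 = "\u60a6"  # 悦
FORM_6085 = "\u6085"  # 悅


def same_code_point(a: str, b: str) -> bool:
    """Return True if the two single characters share one code point."""
    return ord(a) == ord(b)


if __name__ == "__main__":
    print(f"U+{ord(FORM_60A6):04X}", f"U+{ord(FORM_6085):04X}")
    # String comparison, searching and sorting all work on code points,
    # so the two forms never match each other even if a font were to
    # draw them identically.
    print(same_code_point(FORM_60A6, FORM_6085))   # False
    print(FORM_60A6 in "喜" + FORM_6085)           # False
```

This is why redrawing the glyph at U+6085 would not help: text containing U+60A6 would still be stored, searched and exchanged as a different character.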
Q17. |
What is Web Open Font Format (WOFF)? What are the benefits of using WOFF? |
A17. |
WOFF is an open format standardized by the World Wide Web Consortium (W3C) for using fonts on the Web. When a website uses WOFF, web browsers automatically download and temporarily install the fonts when retrieving its web pages from the server. Users are not required to separately download and install fonts on their operating system to display the content. |