AIX Version 4.3 General Programming Concepts: Writing and Debugging Programs

Converters Overview for Programming

National Language Support (NLS) provides a base for internationalization in which data often can be changed from one code set to another. Support of several standard converters for this purpose is provided. This section discusses the following aspects of conversion:

Converters Introduction

Data sent by one program to another program residing on a remote host may require conversion from the code set of the source machine to that of the receiver. For example, when communicating with a VM system, the workstation converts its ISO8859-1 data to an EBCDIC form.

Code sets define graphic characters and control character assignments to code points. These coded characters must also be converted when a program obtains data in one code set but displays it in another code set.

Two interfaces for conversions are provided:

iconv command Allows you to request a specific conversion by naming the FromCode and ToCode code sets.
libiconv functions Allow applications to request converters by name.

The system provides ready-to-use libraries of converters. You supply the name of the converter you want to use. The converter libraries are found in the /usr/lib/nls/loc/iconv/* and /usr/lib/nls/loc/iconvTable/* directories.

In addition to code set converters, the converter library also provides a set of network interchange converters. In a network environment, the code sets of the communications systems and the protocols of communication determine how the data should be converted.

Interchange converters are used to convert data sent from one system to another. Conversions from one internal code set to another require code set converters. When data must be converted from a sender's code set to a receiver's code set or from 8-bit data to 7-bit data, a uniform interface is required. The iconv subroutines provide this interface.

Standard Converters

The system supports standard converters for use with the iconv command and subroutines. The following list describes the different types of converters:

Code Set Converter Types Description
Table converter Converts single-byte stateless code sets. Performs a table translation from one byte to another byte.
Algorithmic converter Performs a conversion that cannot be implemented using a simple single-byte mapping table. All multibyte converters are currently implemented in this way.
Interchange Converter Types Description
7-bit converter Converts between internal code sets and ISO2022 standard interchange formats (7-bit).
8-bit converter Converts between internal code sets and ISO2022 standard interchange formats (8-bit).
Compound Text converter Converts between compound text and internal code sets.
uucode converter Provides the same mapping as that defined in the uuencode and uudecode commands.
UCS-2 converters Convert between UCS-2 and other code sets.
UTF-8 converters Convert between UTF-8 and other code sets.
Miscellaneous Converters Description
Miscellaneous converters Used by some of the converters listed above.

Understanding libiconv

The iconv application programming interface (API) consists of three subroutines that accomplish conversion:

iconv_open Performs the initialization required to convert characters from the code set specified by the FromCode parameter to the code set specified by the ToCode parameter. The strings specified are dependent on the converters installed in the system. If initialization is successful, the converter descriptor, iconv_t, is returned in its initial state.
iconv Invokes the converter function using the descriptor obtained from the iconv_open subroutine. The inbuf parameter points to the first character in the input buffer, and the inbytesleft parameter indicates the number of bytes to the end of the buffer being converted. The outbuf parameter points to the first available byte in the output buffer, and the outbytesleft parameter indicates the number of available bytes to the end of the buffer.

For state-dependent encoding, the subroutine is placed in its initial state by a call for which the inbuf value is a null pointer. Subsequent calls with the inbuf parameter as something other than a null pointer cause the internal state of the function to be altered as necessary.

iconv_close Closes the conversion descriptor specified by the cd variable and frees the resources associated with it.
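The three subroutines can be combined as in the following minimal sketch, which converts a single in-memory buffer. The helper name convert_buffer and its return convention are assumptions made for this illustration, and the code set names that iconv_open accepts depend on the converters installed on the system.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: converts inlen bytes at in from code set `from`
 * to code set `to`, writing at most outcap bytes to out.
 * Returns the number of bytes written, or -1 on any failure.
 */
long convert_buffer(const char *from, const char *to,
                    const char *in, size_t inlen,
                    char *out, size_t outcap)
{
    /* Note the argument order: ToCode first, then FromCode. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;                      /* no matching converter */

    /* iconv does not modify the input; the cast only satisfies the prototype. */
    char *ip = (char *)in;
    char *op = out;
    size_t ileft = inlen;
    size_t oleft = outcap;
    long result = -1;

    /*
     * The first call converts the data. The second call, with a null
     * input buffer, returns a state-dependent converter to its initial
     * state and lets it write any closing shift sequence.
     */
    if (iconv(cd, &ip, &ileft, &op, &oleft) != (size_t)-1 &&
        iconv(cd, NULL, NULL, &op, &oleft) != (size_t)-1)
        result = (long)(outcap - oleft);

    iconv_close(cd);
    return result;
}
```

A caller might pass names such as "ISO8859-1" and "IBM-850" and treat a return value of -1 as either a missing converter or a failed conversion; a fuller program would inspect errno to tell these cases apart.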

In a network environment, two factors determine how data should be converted: the code sets of the communicating systems and the protocol used for communication.

The following table outlines the conversion methods and recommends how you should convert data in different situations. See the "List of Interchange Converters--7-bit" and the "List of Interchange Converters--8-bit" for more information.

Outline of Methods and Recommended Choices

The first two protocol columns apply to communication with a system using the same code set. The last two apply to communication with a system using a different code set, or when the receiver's code set is unknown.

                    Same code set                 Different or unknown code set
                    Protocol                      Protocol
Method to choose    7-bit only    8-bit           7-bit only    8-bit
as is               Not valid     Best choice     Not valid     Not valid if remote code set is unknown
fold7               OK            OK              Best choice   OK
fold8               Not valid     OK              Not valid     Best choice
uucode              Best choice   OK              Not valid     Not valid
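The "Best choice" entries of the table can be expressed as a small decision function. This is only a sketch of the recommendation; the function name best_method and its two flags are hypothetical and are not part of the iconv library.

```c
#include <stdbool.h>

/*
 * Returns the "Best choice" entry of the table for a given situation.
 * same_code_set: the receiver is known to use the sender's code set.
 * eight_bit:     the protocol passes all 8 bits of each byte.
 */
const char *best_method(bool same_code_set, bool eight_bit)
{
    if (same_code_set)
        return eight_bit ? "as is" : "uucode";

    /* Different code set, or the receiver's code set is unknown. */
    return eight_bit ? "fold8" : "fold7";
}
```

The function returns only the preferred method; the other "OK" entries of the table remain valid alternatives.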

If the sender uses the same code set as the receiver, there are two possibilities:

If the sender uses a code set different from the receiver, there are two possibilities:

Using the iconv_open Subroutine

The following examples illustrate how to use the iconv_open subroutine in different situations:

How the iconv_open Subroutine Finds Converters

The iconv_open subroutine uses the LOCPATH environment variable to search for a converter whose name is in the form:

iconv/FromCodeSet_ToCodeSet

The FromCodeSet string represents the sender's code set, and the ToCodeSet string represents the receiver's code set. The underscore character separates the two strings.

Note: All setuid and setgid programs will ignore the LOCPATH environment variable.

Since the iconv converter is a loadable object module, a different object is required when running in the 64-bit environment. In the 64-bit environment, the iconv_open routine will use the LOCPATH environment variable to search for a converter whose name is in the form:

iconv/FromCodeSet_ToCodeSet__64.

The iconv library will automatically choose whether to load the standard converter object or the 64-bit converter object.

If the iconv_open subroutine does not find the converter, it uses the from,to pair to search for a file that defines a table-driven conversion. The file contains a conversion table created by the genxlt command.

The iconvTable converter uses the LOCPATH environment variable to search for a file whose name is in the form:

iconvTable/FromCodeSet_ToCodeSet

If the converter is found, it performs a load operation and is initialized. The converter descriptor, iconv_t, is returned in its initial state.
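The search order described above can be approximated by the following sketch. It is not the library's implementation: the function names are hypothetical, the caller supplies the colon-separated directory list (for example, the value of LOCPATH), and the sketch only tests whether a readable file exists instead of loading it.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Formats "<dir>/<subdir>/<from>_<to><suffix>"; returns 0 if it fits in buf. */
int format_path(char *buf, size_t size, const char *dir, size_t dirlen,
                const char *subdir, const char *from, const char *to,
                const char *suffix)
{
    int n = snprintf(buf, size, "%.*s/%s/%s_%s%s",
                     (int)dirlen, dir, subdir, from, to, suffix);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Tries every directory of a colon-separated list for one naming form. */
int search_form(const char *path_list, const char *subdir,
                const char *from, const char *to, const char *suffix,
                char *buf, size_t size)
{
    const char *dir = path_list;

    while (*dir != '\0') {
        size_t len = strcspn(dir, ":");

        if (len > 0 &&
            format_path(buf, size, dir, len, subdir, from, to, suffix) == 0 &&
            access(buf, R_OK) == 0)
            return 0;               /* buf holds the file that was found */

        dir += len;
        if (*dir == ':')
            dir++;
    }
    return -1;
}

/* Looks for a converter program first, then for a genxlt table. */
int find_converter(const char *path_list, const char *from, const char *to,
                   int is_64bit, char *buf, size_t size)
{
    if (search_form(path_list, "iconv", from, to,
                    is_64bit ? "__64" : "", buf, size) == 0)
        return 0;
    return search_form(path_list, "iconvTable", from, to, "", buf, size);
}
```

With path_list set to /usr/lib/nls/loc, a request for ISO8859-1 to IBM-850 would first test /usr/lib/nls/loc/iconv/ISO8859-1_IBM-850 and then /usr/lib/nls/loc/iconvTable/ISO8859-1_IBM-850.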

Converter Programs versus Tables

Converter programs are executable functions that convert data according to a set of rules. Converter tables are single-byte conversion tables that perform stateless conversions. Programs and tables are in separate directories:

/usr/lib/nls/loc/iconv Converter programs
/usr/lib/nls/loc/iconvTable Converter tables.

After a converter program is compiled and linked with the libiconv.a library, the program is placed in the /usr/lib/nls/loc/iconv directory.

To build a table converter, build a source converter table file. Use the genxlt command to compile translation tables into a format understood by the table converter. The output file is then placed in the /usr/lib/nls/loc/iconvTable directory.
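Conceptually, a table converter performs a lookup such as the one sketched below. The function names are hypothetical, and the on-disk format that the genxlt command produces is not shown here; the sketch only illustrates the stateless one-byte-in, one-byte-out translation.

```c
#include <stddef.h>

/* Translates len bytes through a 256-entry table, one byte in, one byte out. */
void table_convert(const unsigned char table[256],
                   const unsigned char *in, unsigned char *out, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = table[in[i]];
}

/* Fills a table with the identity mapping, a starting point for edits. */
void table_identity(unsigned char table[256])
{
    for (int i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
}
```

For example, an ISO8859-1 to IBM-850 table would map the byte 0xE9 (e with acute accent) to 0x82. Because no state is kept between bytes, the same table can be applied to any slice of the input.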

Unicode and Universal Converters

Unicode (or UCS-2) conversion tables are found in:

$LOCPATH/uconvTable/*CodeSet*

The $LOCPATH/uconv/UCSTBL converter program is used to perform the conversion to and from UCS-2 using the iconv utilities. For the iconv utilities to use these uconvTable conversion tables, links must be set up within the $LOCPATH/iconv directory, for example, for code set "X."

ln -s /usr/lib/nls/loc/uconv/UCSTBL /usr/lib/nls/loc/iconv/X_UCS-2
ln -s /usr/lib/nls/loc/uconv/UCSTBL /usr/lib/nls/loc/iconv/UCS-2_X

A "Universal converter" program is provided that can be used to convert between any two code sets whose conversions to and from UCS-2 are defined. Given the following uconvTables:

X     -> UCS-2
UCS-2 -> Y

a universal conversion can be defined that maps

X -> UCS-2 -> Y

by using the $LOCPATH/iconv/Universal_UCS_Conv converter. The conversion X->Y is set by defining links to the universal converter, for example:

ln -s /usr/lib/nls/loc/iconv/Universal_UCS_Conv /usr/lib/nls/loc/iconv/X_Y
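The idea of converting through UCS-2 as a pivot can be sketched for single-byte code sets as follows. The internals of the UCSTBL and Universal_UCS_Conv programs are not described here, so the table layout, the names, and the substitute-character handling are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a hypothetical UCS-2 -> Y table. */
struct ucs2_to_byte {
    uint16_t ucs2;
    unsigned char byte;
};

/*
 * Converts len bytes from code set X to code set Y by way of UCS-2.
 * Characters with no mapping in Y are replaced by `substitute`.
 * Returns the number of substitutions made.
 */
size_t pivot_convert(const uint16_t x_to_ucs2[256],
                     const struct ucs2_to_byte *ucs2_to_y, size_t y_entries,
                     unsigned char substitute,
                     const unsigned char *in, unsigned char *out, size_t len)
{
    size_t substitutions = 0;

    for (size_t i = 0; i < len; i++) {
        uint16_t ucs2 = x_to_ucs2[in[i]];      /* step 1: X -> UCS-2 */
        size_t j;

        for (j = 0; j < y_entries; j++)        /* step 2: UCS-2 -> Y */
            if (ucs2_to_y[j].ucs2 == ucs2)
                break;

        if (j < y_entries) {
            out[i] = ucs2_to_y[j].byte;
        } else {
            out[i] = substitute;
            substitutions++;
        }
    }
    return substitutions;
}
```

The linear search keeps the sketch short; a production converter would more likely index the second table directly. The benefit of the pivot is that each code set needs only its own pair of UCS-2 tables instead of one table per code set pair.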

Using Converters

The iconv interface is a set of subroutines used to open, perform, and close conversions: iconv_open, iconv, and iconv_close.

Code Set Conversion Filter Example

The following example shows how you can use these subroutines to create a code set conversion filter that accepts the ToCode and FromCode parameters as input arguments:

#include <stdio.h>
#include <stdlib.h>
#include <nl_types.h>
#include <iconv.h>
#include <string.h>
#include <errno.h>
#include <locale.h>

/*
 * iconv returns (size_t)-1 on failure. The errno value is saved in err
 * right after the call so that later library calls cannot overwrite it.
 */
#define ICONV_DONE()  (r != (size_t)-1)
#define ICONV_INVAL() ((r == (size_t)-1) && (err == EILSEQ))
#define ICONV_OVER()  ((r == (size_t)-1) && (err == E2BIG))
#define ICONV_TRUNC() ((r == (size_t)-1) && (err == EINVAL))

#define USAGE 1
#define ERROR 2
#define INCOMP 3

char ibuf[BUFSIZ], obuf[BUFSIZ];

int main(int argc, char **argv)
{
    size_t  ileft, oleft;
    size_t  r;
    int     err;
    nl_catd catd;
    iconv_t cd;
    char    *ip, *op;

    setlocale(LC_ALL, "");
    catd = catopen(argv[0], 0);

    if (argc != 3) {
        fputs(catgets(catd, NL_SETD, USAGE,
                      "usage: conv fromcode tocode\n"), stderr);
        exit(1);
    }

    /* iconv_open takes the ToCode first, then the FromCode. */
    cd = iconv_open(argv[2], argv[1]);
    if (cd == (iconv_t)-1) {
        perror("iconv_open");
        exit(1);
    }

    ileft = 0;

    while (!feof(stdin) && !ferror(stdin)) {
        /*
         * After the next operation, ibuf will
         * contain new data plus any truncated
         * data left from the previous read.
         */
        ileft += fread(ibuf + ileft, 1, BUFSIZ - ileft, stdin);
        do {
            ip = ibuf;
            op = obuf;
            oleft = BUFSIZ;

            r = iconv(cd, &ip, &ileft, &op, &oleft);
            err = errno;

            if (ICONV_INVAL()) {
                fputs(catgets(catd, NL_SETD, ERROR,
                              "invalid input\n"), stderr);
                exit(2);
            }

            fwrite(obuf, 1, BUFSIZ - oleft, stdout);

            if (ICONV_TRUNC() || ICONV_OVER())
                /*
                 * Data remaining in buffer: move it to the
                 * beginning. The regions may overlap, so
                 * memmove is used instead of memcpy.
                 */
                memmove(ibuf, ip, ileft);

            /*
             * Loop until all characters in the input
             * buffer have been converted.
             */
        } while (ICONV_OVER());
    }

    if (ileft != 0) {
        /*
         * This can only happen if the last call
         * to iconv() returned ICONV_TRUNC, meaning
         * the last data in the input stream was
         * incomplete.
         */
        fputs(catgets(catd, NL_SETD, INCOMP,
                      "input incomplete\n"), stderr);
        exit(3);
    }

    iconv_close(cd);
    exit(0);
}

Naming Converters

Code set names are in the form CodesetRegistry-CodesetEncoding where:

CodesetRegistry Identifies the registration authority for the encoding. The CodesetRegistry must be made of characters from the portable code set (usually A-Z and 0-9).
CodesetEncoding Identifies the coded character set defined by the registered authority.

The from,to variable used by the iconv command and iconv_open subroutine identifies a file whose name should be in the form /usr/lib/nls/loc/iconv/%f_%t or /usr/lib/nls/loc/iconvTable/%f_%t, where:

%f Represents the FromCode set name.
%t Represents the ToCode set name.

List of Converters

Converters change data from one code set to another. The sets of converters supported with the iconv library are in the following sections. All converters shipped with the BOS Runtime Environment are located in the /usr/lib/nls/loc/iconv/* or /usr/lib/nls/loc/iconvTable/* directory.

These directories also contain private converters; that is, they are used by other converters. However, users and programs should only depend on the converters in the following lists.

Any converter shipped with the BOS Runtime Environment and not listed here should be considered private and subject to change or deletion. Converters supplied by other products can be placed in the /usr/lib/nls/loc/iconv/* or /usr/lib/nls/loc/iconvTable/* directory.

Programmers are encouraged to use registered code set names or code set names associated with an application. The X Consortium maintains a registry of code set names for reference. See the "Code Set Overview" for more information about code sets.

List of PC, ISO, and EBCDIC Code Set Converters

These converters provide conversion between PC, ISO, and EBCDIC single-byte stateless code sets. The following types of conversions are supported: PC to/from ISO, PC to/from EBCDIC, and ISO to/from EBCDIC.

Conversion is provided between compatible code sets such as Latin-1 to Latin-1 and Greek to Greek. However, conversion between different EBCDIC national code sets is not supported. For information about converting between incompatible character sets refer to the "List of Interchange Converters--7-bit" and the "List of Interchange Converters--8-bit".

Conversion tables in the iconvTable directory are created by the genxlt command.

Compatible Code Set Names

The following table lists code set names that are compatible. Each line defines to/from strings that may be used when requesting a converter.

Note: The PC and ISO code sets are ASCII-based.
Code Set Compatibility
Character Set Languages PC ISO EBCDIC
Latin-1 U.S. English, Portuguese, Canadian French IBM-850 ISO8859-1 IBM-037
Latin-1 Danish, Norwegian IBM-850 ISO8859-1 IBM-277
Latin-1 Finnish, Swedish IBM-850 ISO8859-1 IBM-278
Latin-1 Italian IBM-850 ISO8859-1 IBM-280
Latin-1 Japanese IBM-850 ISO8859-1 IBM-281
Latin-1 Spanish IBM-850 ISO8859-1 IBM-284
Latin-1 U.K. English IBM-850 ISO8859-1 IBM-285
Latin-1 German IBM-850 ISO8859-1 IBM-273
Latin-1 French IBM-850 ISO8859-1 IBM-297
Latin-1 Belgian, Swiss German IBM-850 ISO8859-1 IBM-500
Latin-2 Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene IBM-852 ISO8859-2 IBM-870
Cyrillic Bulgarian, Macedonian, Serbian Cyrillic, Russian IBM-855 ISO8859-5 IBM-880 IBM-1025
Cyrillic Russian IBM-866 ISO8859-5 IBM-1025
Hebrew Hebrew IBM-856 IBM-862 ISO8859-8 IBM-424 IBM-803
Turkish Turkish IBM-857 ISO8859-9 IBM-1026
Arabic Arabic IBM-864 IBM-1046 ISO8859-6 IBM-420
Greek Greek IBM-869 ISO8859-7 IBM-875
Baltic Lithuanian, Latvian, Estonian IBM-921 IBM-922 IBM-1112 IBM-1122

Note: A character that exists in the source code set but does not exist in the target code set is converted to a converter-defined substitute character.

Files

The following table describes the iconvTable converters found in the /usr/lib/nls/loc/iconvTable directory:

iconvTable Converters
Converter Table Description Language
IBM-037_IBM-850 IBM-037 to IBM-850 U.S. English, Portuguese, Canadian-French
IBM-273_IBM-850 IBM-273 to IBM-850 German
IBM-277_IBM-850 IBM-277 to IBM-850 Danish, Norwegian
IBM-278_IBM-850 IBM-278 to IBM-850 Finnish, Swedish
IBM-280_IBM-850 IBM-280 to IBM-850 Italian
IBM-281_IBM-850 IBM-281 to IBM-850 Japanese-Latin
IBM-284_IBM-850 IBM-284 to IBM-850 Spanish
IBM-285_IBM-850 IBM-285 to IBM-850 U.K. English
IBM-297_IBM-850 IBM-297 to IBM-850 French
IBM-420_IBM-1046 IBM-420 to IBM-1046 Arabic
IBM-424_IBM-856 IBM-424 to IBM-856 Hebrew
IBM-424_IBM-862 IBM-424 to IBM-862 Hebrew
IBM-500_IBM-850 IBM-500 to IBM-850 Belgian, Swiss German
IBM-803_IBM-856 IBM-803 to IBM-856 Hebrew
IBM-803_IBM-862 IBM-803 to IBM-862 Hebrew
IBM-850_IBM-037 IBM-850 to IBM-037 U.S. English, Portuguese, Canadian-French
IBM-850_IBM-273 IBM-850 to IBM-273 German
IBM-850_IBM-277 IBM-850 to IBM-277 Danish, Norwegian
IBM-850_IBM-278 IBM-850 to IBM-278 Finnish, Swedish
IBM-850_IBM-280 IBM-850 to IBM-280 Italian
IBM-850_IBM-281 IBM-850 to IBM-281 Japanese-Latin
IBM-850_IBM-284 IBM-850 to IBM-284 Spanish
IBM-850_IBM-285 IBM-850 to IBM-285 U.K. English
IBM-850_IBM-297 IBM-850 to IBM-297 French
IBM-850_IBM-500 IBM-850 to IBM-500 Belgian, Swiss German
IBM-856_IBM-424 IBM-856 to IBM-424 Hebrew
IBM-856_IBM-803 IBM-856 to IBM-803 Hebrew
IBM-856_IBM-862 IBM-856 to IBM-862 Hebrew
IBM-862_IBM-424 IBM-862 to IBM-424 Hebrew
IBM-862_IBM-803 IBM-862 to IBM-803 Hebrew
IBM-862_IBM-856 IBM-862 to IBM-856 Hebrew
IBM-864_IBM-1046 IBM-864 to IBM-1046 Arabic
IBM-921_IBM-1112 IBM-921 to IBM-1112 Lithuanian, Latvian
IBM-922_IBM-1122 IBM-922 to IBM-1122 Estonian
IBM-1112_IBM-921 IBM-1112 to IBM-921 Lithuanian, Latvian
IBM-1122_IBM-922 IBM-1122 to IBM-922 Estonian
IBM-1046_IBM-420 IBM-1046 to IBM-420 Arabic
IBM-1046_IBM-864 IBM-1046 to IBM-864 Arabic
IBM-037_ISO8859-1 IBM-037 to ISO8859-1 U.S. English, Portuguese, Canadian French
IBM-273_ISO8859-1 IBM-273 to ISO8859-1 German
IBM-277_ISO8859-1 IBM-277 to ISO8859-1 Danish, Norwegian
IBM-278_ISO8859-1 IBM-278 to ISO8859-1 Finnish, Swedish
IBM-280_ISO8859-1 IBM-280 to ISO8859-1 Italian
IBM-281_ISO8859-1 IBM-281 to ISO8859-1 Japanese-Latin
IBM-284_ISO8859-1 IBM-284 to ISO8859-1 Spanish
IBM-285_ISO8859-1 IBM-285 to ISO8859-1 U.K. English
IBM-297_ISO8859-1 IBM-297 to ISO8859-1 French
IBM-420_ISO8859-6 IBM-420 to ISO8859-6 Arabic
IBM-424_ISO8859-8 IBM-424 to ISO8859-8 Hebrew
IBM-500_ISO8859-1 IBM-500 to ISO8859-1 Belgian, Swiss German
IBM-803_ISO8859-8 IBM-803 to ISO8859-8 Hebrew
IBM-852_ISO8859-2 IBM-852 to ISO8859-2 Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene
IBM-855_ISO8859-5 IBM-855 to ISO8859-5 Bulgarian, Macedonian, Serbian Cyrillic, Russian
IBM-866_ISO8859-5 IBM-866 to ISO8859-5 Russian
IBM-869_ISO8859-7 IBM-869 to ISO8859-7 Greek
IBM-875_ISO8859-7 IBM-875 to ISO8859-7 Greek
IBM-870_ISO8859-2 IBM-870 to ISO8859-2 Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene
IBM-880_ISO8859-5 IBM-880 to ISO8859-5 Bulgarian, Macedonian, Serbian Cyrillic, Russian
IBM-1025_ISO8859-5 IBM-1025 to ISO8859-5 Bulgarian, Macedonian, Serbian Cyrillic, Russian
IBM-857_ISO8859-9 IBM-857 to ISO8859-9 Turkish
IBM-1026_ISO8859-9 IBM-1026 to ISO8859-9 Turkish
IBM-850_ISO8859-1 IBM-850 to ISO8859-1 Latin
IBM-856_ISO8859-8 IBM-856 to ISO8859-8 Hebrew
IBM-862_ISO8859-8 IBM-862 to ISO8859-8 Hebrew
IBM-864_ISO8859-6 IBM-864 to ISO8859-6 Arabic
IBM-1046_ISO8859-6 IBM-1046 to ISO8859-6 Arabic
ISO8859-1_IBM-850 ISO8859-1 to IBM-850 Latin
ISO8859-6_IBM-864 ISO8859-6 to IBM-864 Arabic
ISO8859-6_IBM-1046 ISO8859-6 to IBM-1046 Arabic
ISO8859-8_IBM-856 ISO8859-8 to IBM-856 Hebrew
ISO8859-8_IBM-862 ISO8859-8 to IBM-862 Hebrew
ISO8859-1_IBM-037 ISO8859-1 to IBM-037 U.S. English, Portuguese, Canadian French
ISO8859-1_IBM-273 ISO8859-1 to IBM-273 German
ISO8859-1_IBM-277 ISO8859-1 to IBM-277 Danish, Norwegian
ISO8859-1_IBM-278 ISO8859-1 to IBM-278 Finnish, Swedish
ISO8859-1_IBM-280 ISO8859-1 to IBM-280 Italian
ISO8859-1_IBM-281 ISO8859-1 to IBM-281 Japanese-Latin
ISO8859-1_IBM-284 ISO8859-1 to IBM-284 Spanish
ISO8859-1_IBM-285 ISO8859-1 to IBM-285 U.K. English
ISO8859-1_IBM-297 ISO8859-1 to IBM-297 French
ISO8859-1_IBM-500 ISO8859-1 to IBM-500 Belgian, Swiss German
ISO8859-2_IBM-852 ISO8859-2 to IBM-852 Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene
ISO8859-2_IBM-870 ISO8859-2 to IBM-870 Croatian, Czechoslovakian, Hungarian, Polish, Romanian, Serbian Latin, Slovak, Slovene
ISO8859-5_IBM-855 ISO8859-5 to IBM-855 Bulgarian, Macedonian, Serbian Cyrillic, Russian
ISO8859-5_IBM-880 ISO8859-5 to IBM-880 Bulgarian, Macedonian, Serbian Cyrillic, Russian
ISO8859-5_IBM-1025 ISO8859-5 to IBM-1025 Bulgarian, Macedonian, Serbian Cyrillic, Russian
ISO8859-6_IBM-420 ISO8859-6 to IBM-420 Arabic
ISO8859-5_IBM-866 ISO8859-5 to IBM-866 Russian
ISO8859-7_IBM-869 ISO8859-7 to IBM-869 Greek
ISO8859-7_IBM-875 ISO8859-7 to IBM-875 Greek
ISO8859-8_IBM-424 ISO8859-8 to IBM-424 Hebrew
ISO8859-8_IBM-803 ISO8859-8 to IBM-803 Hebrew
ISO8859-9_IBM-857 ISO8859-9 to IBM-857 Turkish
ISO8859-9_IBM-1026 ISO8859-9 to IBM-1026 Turkish

List of Multibyte Code Set Converters

Multibyte code-set converters convert characters among the following code-sets:

The following table lists code set names that are compatible. Each line defines to/from strings that may be used when requesting a converter.

Code Set Compatibility
Language PC ISO EBCDIC
Japanese IBM-932 IBM-eucJP IBM-930, IBM-939
Japanese (MS compatible) IBM-943 IBM-eucJP IBM-930, IBM-939
Korean IBM-934 IBM-eucKR IBM-933
Traditional Chinese IBM-938, big-5 IBM-eucTW IBM-937
Simplified Chinese IBM-1381 IBM-eucCN IBM-935
  1. Conversions between Simplified and Traditional Chinese are provided (IBM-eucTW <--> IBM-eucCN and big5 <--> IBM-eucCN).

  2. UTF-8 is an additional code set. See "List of UTF-8 Interchange Converters" for more information.

Files

The following list describes the Multibyte Code Set converters that are found in the /usr/lib/nls/loc/iconv directory.

Converter Description
IBM-eucJP_IBM-932 IBM-eucJP to IBM-932
IBM-eucJP_IBM-943 IBM-eucJP to IBM-943
IBM-eucJP_IBM-930 IBM-eucJP to IBM-930
IBM-eucCN_IBM-936(PC5550) IBM-eucCN to IBM-936(PC5550)
IBM-eucCN_IBM-935 IBM-eucCN to IBM-935
IBM-eucJP_IBM-939 IBM-eucJP to IBM-939
IBM-eucCN_IBM-1381 IBM-eucCN to IBM-1381
IBM-943_IBM-932 IBM-943 to IBM-932
IBM-932_IBM-943 IBM-932 to IBM-943
IBM-930_IBM-932 IBM-930 to IBM-932
IBM-930_IBM-943 IBM-930 to IBM-943
IBM-930_IBM-eucJP IBM-930 to IBM-eucJP
IBM-932_IBM-eucJP IBM-932 to IBM-eucJP
IBM-932_IBM-930 IBM-932 to IBM-930
IBM-943_IBM-eucJP IBM-943 to IBM-eucJP
IBM-943_IBM-930 IBM-943 to IBM-930
IBM-936(PC5550)_IBM-935 IBM-936(PC5550) to IBM-935
IBM-936_IBM-935 IBM-936 to IBM-935
IBM-932_IBM-939 IBM-932 to IBM-939
IBM-939_IBM-932 IBM-939 to IBM-932
IBM-943_IBM-939 IBM-943 to IBM-939
IBM-939_IBM-943 IBM-939 to IBM-943
IBM-935_IBM-936(PC5550) IBM-935 to IBM-936(PC5550)
IBM-935_IBM-936 IBM-935 to IBM-936
IBM-1381_IBM-935 IBM-1381 to IBM-935
IBM-935_IBM-1381 IBM-935 to IBM-1381
IBM-935_IBM-eucCN IBM-935 to IBM-eucCN
IBM-936(PC5550)_IBM-eucCN IBM-936(PC5550) to IBM-eucCN
IBM-eucTW_IBM-eucCN IBM-eucTW to IBM-eucCN
big5_IBM-eucCN big5 to IBM-eucCN
IBM-1381_IBM-eucCN IBM-1381 to IBM-eucCN
IBM-939_IBM-eucJP IBM-939 to IBM-eucJP
IBM-eucKR_IBM-934 IBM-eucKR to IBM-934
IBM-934_IBM-eucKR IBM-934 to IBM-eucKR
IBM-eucKR_IBM-933 IBM-eucKR to IBM-933
IBM-933_IBM-eucKR IBM-933 to IBM-eucKR
IBM-eucTW_IBM-937 IBM-eucTW to IBM-937
IBM-938_IBM-937 IBM-938 to IBM-937
big-5_IBM-937 big-5 to IBM-937
IBM-eucCN_IBM-eucTW IBM-eucCN to IBM-eucTW
IBM-937_IBM-eucTW IBM-937 to IBM-eucTW
IBM-937_IBM-938 IBM-937 to IBM-938
IBM-eucTW_IBM-938 IBM-eucTW to IBM-938
IBM-eucCN_big5 IBM-eucCN to big5
IBM-eucTW_big-5 IBM-eucTW to big-5
IBM-937_big-5 IBM-937 to big-5
CNS11643.1992-3_IBM-eucTW CNS11643.1992-3 to IBM-eucTW
CNS11643.1992-3-GL_IBM-eucTW CNS11643.1992-3-GL to IBM-eucTW
CNS11643.1992-3-GR_IBM-eucTW CNS11643.1992-3-GR to IBM-eucTW
CNS11643.1992-4_IBM-eucTW CNS11643.1992-4 to IBM-eucTW
CNS11643.1992-4-GL_IBM-eucTW CNS11643.1992-4-GL to IBM-eucTW
CNS11643.1992-4-GR_IBM-eucTW CNS11643.1992-4-GR to IBM-eucTW
IBM-eucTW_CNS11643.1992-3 IBM-eucTW to CNS11643.1992-3
IBM-eucTW_CNS11643.1992-3-GL IBM-eucTW to CNS11643.1992-3-GL
IBM-eucTW_CNS11643.1992-3-GR IBM-eucTW to CNS11643.1992-3-GR
IBM-eucTW_CNS11643.1992-4 IBM-eucTW to CNS11643.1992-4
IBM-eucTW_CNS11643.1992-4-GL IBM-eucTW to CNS11643.1992-4-GL
IBM-eucTW_CNS11643.1992-4-GR IBM-eucTW to CNS11643.1992-4-GR
IBM-eucCN_GB2312.1980-1 IBM-eucCN to GB2312.1980-1
IBM-eucCN_GB2312.1980-1-GL IBM-eucCN to GB2312.1980-1-GL
IBM-eucCN_GB2312.1980-1-GR IBM-eucCN to GB2312.1980-1-GR
IBM-937_csic IBM-937 to csic
csic_IBM-937 csic to IBM-937
IBM-938_csic IBM-938 to csic
csic_IBM-938 csic to IBM-938
IBM-eucTW_ccdc IBM-eucTW to ccdc
ccdc_IBM-eucTW ccdc to IBM-eucTW
IBM-eucTW_cns IBM-eucTW to cns
cns_IBM-eucTW cns to IBM-eucTW
IBM-eucTW_csic IBM-eucTW to csic
csic_IBM-eucTW csic to IBM-eucTW
IBM-eucTW_sops IBM-eucTW to sops
sops_IBM-eucTW sops to IBM-eucTW
IBM-eucTW_tca IBM-eucTW to tca
tca_IBM-eucTW tca to IBM-eucTW
big5_cns big5 to cns
cns_big5 cns to big5
big5_csic big5 to csic
csic_big5 csic to big5
big5_ttc big5 to ttc
ttc_big5 ttc to big5
big5_ttcmin big5 to ttcmin
ttcmin_big5 ttcmin to big5
big5_unicode big5 to unicode
unicode_big5 unicode to big5
big5_wang big5 to wang
wang_big5 wang to big5
ccdc_csic ccdc to csic
csic_ccdc csic to ccdc
csic_sops csic to sops
sops_csic sops to csic
CNS11643.1986-1_big5 CNS11643.1986-1 to big5
big5_CNS11643.1986-1 big5 to CNS11643.1986-1
CNS11643.1986-1-GR_big5 CNS11643.1986-1-GR to big5
big5_CNS11643.1986-1-GR big5 to CNS11643.1986-1-GR
CNS11643.1986-2_big5 CNS11643.1986-2 to big5
big5_CNS11643.1986-2 big5 to CNS11643.1986-2
CNS11643.1986-2-GR_big5 CNS11643.1986-2-GR to big5
big5_CNS11643.1986-2-GR big5 to CNS11643.1986-2-GR
CNS11643.CT-GR_big5 CNS11643.CT-GR to big5
big5_CNS11643.CT-GR big5 to CNS11643.CT-GR
IBM-sbdTW-GR_big5 IBM-sbdTW-GR to big5
big5_IBM-sbdTW-GR big5 to IBM-sbdTW-GR
IBM-sbdTW.CT-GR_big5 IBM-sbdTW.CT-GR to big5
big5_IBM-sbdTW.CT-GR big5 to IBM-sbdTW.CT-GR
IBM-sbdTW_big5 IBM-sbdTW to big5
big5_IBM-sbdTW big5 to IBM-sbdTW
IBM-udcTW-GR_big5 IBM-udcTW-GR to big5
big5_IBM-udcTW-GR big5 to IBM-udcTW-GR
IBM-udcTW.CT-GR_big5 IBM-udcTW.CT-GR to big5
big5_IBM-udcTW.CT-GR big5 to IBM-udcTW.CT-GR
ISO8859-1_big5 ISO8859-1 to big5
big5_ISO8859-1 big5 to ISO8859-1
big5_ASCII-GR big5 to ASCII-GR
ASCII-GR_big5 ASCII-GR to big5
GBK_big5 GBK to big5
big5_GBK big5 to GBK
GBK_IBM-eucTW GBK to IBM-eucTW
IBM-eucTW_GBK IBM-eucTW to GBK
CNS11643.1986-1_GBK CNS11643.1986-1 to GBK
GBK_CNS11643.1986-1 GBK to CNS11643.1986-1
CNS11643.1986-2_GBK CNS11643.1986-2 to GBK
GBK_CNS11643.1986-2 GBK to CNS11643.1986-2
CNS11643.1986-1-GR_GBK CNS11643.1986-1-GR to GBK
GBK_CNS11643.1986-1-GR GBK to CNS11643.1986-1-GR
CNS11643.1986-2-GR_GBK CNS11643.1986-2-GR to GBK
GBK_CNS11643.1986-2-GR GBK to CNS11643.1986-2-GR
CNS11643.1986-1-GL_GBK CNS11643.1986-1-GL to GBK
GBK_CNS11643.1986-1-GL GBK to CNS11643.1986-1-GL
CNS11643.1986-2-GL_GBK CNS11643.1986-2-GL to GBK
GBK_CNS11643.1986-2-GL GBK to CNS11643.1986-2-GL
CNS11643.CT-GR_GBK CNS11643.CT-GR to GBK
GBK_CNS11643.CT-GR GBK to CNS11643.CT-GR
GB2312.1980.CT-GR_GBK GB2312.1980.CT-GR to GBK
GBK_GB2312.1980.CT-GR GBK to GB2312.1980.CT-GR
GB2312.1980-0_GBK GB2312.1980-0 to GBK
GBK_GB2312.1980-0 GBK to GB2312.1980-0
GB2312.1980-0-GR_GBK GB2312.1980-0-GR to GBK
GBK_GB2312.1980-0-GR GBK to GB2312.1980-0-GR
GB2312.1980-0-GL_GBK GB2312.1980-0-GL to GBK
GBK_GB2312.1980-0-GL GBK to GB2312.1980-0-GL
ASCII-GR_GBK ASCII-GR to GBK
GBK_ASCII-GR GBK to ASCII-GR
ISO8859-1_GBK ISO8859-1 to GBK
GBK_ISO8859-1 GBK to ISO8859-1
IBM-eucCN_GBK IBM-eucCN to GBK
GBK_IBM-eucCN GBK to IBM-eucCN

List of Interchange Converters--7-bit

This converter provides conversion between internal code and 7-bit standard interchange formats (fold7). The fold7 name identifies encodings that can be used to pass text data through 7-bit mail protocols. The encodings are based on ISO2022. For more information about fold7 see "Understanding libiconv".

The fold7 converters convert characters from a code set to a canonical 7-bit encoding that identifies each character. This type of conversion is useful in networks where clients communicate with different code sets but use the same character sets. For example:

IBM-850 <--> ISO8859-1 Common Latin characters
IBM-932 <--> IBM-eucJP Common Japanese characters

The following escape sequences designate standard code sets:

Escape Sequence Standard Code Set
01/11 02/04 04/00 GL JIS X0208.1978-0.
01/11 02/04 02/08 04/01 GL left half of GB2312.1980-0.
01/11 02/08 04/02 GL 7-bit ASCII or left half of ISO8859-1.
01/11 02/14 04/01 GL right half of ISO8859-1.
01/11 02/14 04/02 GL right half of ISO8859-2.
01/11 02/14 04/03 GL right half of ISO8859-3.
01/11 02/14 04/04 GL right half of ISO8859-4.
01/11 02/14 04/06 GL right half of ISO8859-7.
01/11 02/14 04/07 GL right half of ISO8859-6.
01/11 02/14 04/08 GL right half of ISO8859-8.
01/11 02/14 04/12 GL right half of ISO8859-5.
01/11 02/14 04/13 GL right half of ISO8859-9.
01/11 02/08 04/09 GL right half of JIS X0201.1976-0.
01/11 02/08 04/10 GL left half of JIS X0201.1976.
01/11 02/04 04/02 GL JIS X0208.1983-0.
01/11 02/04 02/08 04/02 GL JIS X0208.1983-0.
01/11 02/04 02/08 04/00 GL JISX0208.1978-0.
01/11 02/05 02/15 03/01 M L 06/09 06/02 06/13 02/13 03/08 03/05 03/00 00/02 GL right half of IBM-850 unique characters. Characters common to ISO8859-1 do not use this escape sequence.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02 GL Japanese (IBM-udcJP) user-definable characters.
01/11 02/04 02/08 04/03 GL KSC5601-1987.
01/11 02/04 02/09 03/00 GL CNS11643-1986-1.
01/11 02/04 02/10 03/01 GL CNS11643-1986-2.
01/11 02/05 02/15 03/00 M L 05/05 05/04 04/06 02/13 03/07 00/02 UCS-2 encoded as base64; used only for those characters not encoded by any of the other 7-bit escape sequences listed above.

When converting from a code set to fold7, the escape sequence used to designate the code set is chosen according to the order listed. For example, the JISX0208.1983-0 characters use 01/11 02/04 04/02 as the designation.
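The escape sequences in the table are written in the column/row notation of ISO2022, where a position cc/rr stands for the byte value cc * 16 + rr. As a minimal sketch, the following hypothetical helper turns such a listing into bytes; it handles only plain numeric positions and rejects the longer entries that contain the M and L length placeholders.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Parses positions such as "01/11 02/04 04/02" into bytes.
 * Returns the number of bytes stored, or -1 on malformed input
 * or if more than cap bytes would be needed.
 */
int parse_positions(const char *text, unsigned char *out, size_t cap)
{
    size_t count = 0;
    unsigned col, row;
    int used;

    while (sscanf(text, " %2u/%2u%n", &col, &row, &used) == 2) {
        if (col > 15 || row > 15 || count == cap)
            return -1;
        out[count++] = (unsigned char)(col * 16 + row);
        text += used;
    }

    /* Anything left other than blanks is a syntax error. */
    while (*text == ' ')
        text++;
    return *text == '\0' ? (int)count : -1;
}
```

Applied to 01/11 02/04 04/02, the helper yields the bytes 0x1B 0x24 0x42, that is, ESC followed by the characters $ and B.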

Files

The following list describes the fold7 converters that are found in the /usr/lib/nls/loc/iconv directory:

Converter Description
fold7_IBM-850 Interchange format to IBM-850
fold7_IBM-921 Interchange format to IBM-921
fold7_IBM-922 Interchange format to IBM-922
fold7_IBM-932 Interchange format to IBM-932
fold7_IBM-943 Interchange format to IBM-943
fold7_IBM-1124 Interchange format to IBM-1124
fold7_IBM-1129 Interchange format to IBM-1129
fold7_IBM-eucCN Interchange format to IBM-eucCN
fold7_IBM-eucJP Interchange format to IBM-eucJP
fold7_IBM-eucKR Interchange format to IBM-eucKR
fold7_IBM-eucTW Interchange format to IBM-eucTW
fold7_ISO8859-1 Interchange format to ISO8859-1
fold7_ISO8859-2 Interchange format to ISO8859-2
fold7_ISO8859-3 Interchange format to ISO8859-3
fold7_ISO8859-4 Interchange format to ISO8859-4
fold7_ISO8859-5 Interchange format to ISO8859-5
fold7_ISO8859-6 Interchange format to ISO8859-6
fold7_ISO8859-7 Interchange format to ISO8859-7
fold7_ISO8859-8 Interchange format to ISO8859-8
fold7_ISO8859-9 Interchange format to ISO8859-9
fold7_TIS-620 Interchange format to TIS-620
fold7_UTF-8 Interchange format to UTF-8
fold7_big5 Interchange format to big5
fold7_GBK Interchange format to GBK
IBM-921_fold7 IBM-921 to interchange format
IBM-922_fold7 IBM-922 to interchange format
IBM-850_fold7 IBM-850 to interchange format
IBM-932_fold7 IBM-932 to interchange format
IBM-943_fold7 IBM-943 to interchange format
IBM-1124_fold7 IBM-1124 to interchange format
IBM-1129_fold7 IBM-1129 to interchange format
IBM-eucCN_fold7 IBM-eucCN to interchange format
IBM-eucJP_fold7 IBM-eucJP to interchange format
IBM-eucKR_fold7 IBM-eucKR to interchange format
IBM-eucTW_fold7 IBM-eucTW to interchange format
ISO8859-1_fold7 ISO8859-1 to interchange format
ISO8859-2_fold7 ISO8859-2 to interchange format
ISO8859-3_fold7 ISO8859-3 to interchange format
ISO8859-4_fold7 ISO8859-4 to interchange format
ISO8859-5_fold7 ISO8859-5 to interchange format
ISO8859-6_fold7 ISO8859-6 to interchange format
ISO8859-7_fold7 ISO8859-7 to interchange format
ISO8859-8_fold7 ISO8859-8 to interchange format
ISO8859-9_fold7 ISO8859-9 to interchange format
TIS-620_fold7 TIS-620 to interchange format
UTF-8_fold7 UTF-8 to interchange format
big5_fold7 big5 to interchange format
GBK_fold7 GBK to interchange format

List of Interchange Converters--8-bit

This converter provides conversions between internal code and 8-bit standard interchange formats (fold8). The fold8 name identifies encodings that can be used to pass text data through 8-bit mail protocols. The encodings are based on ISO2022. For more information about fold8 see "Understanding libiconv".

The fold8 converters convert characters from a specific code set encoding to a canonical 8-bit encoding that identifies each character. This type of conversion is useful in networks where clients communicate with different code sets but use the same character sets. For example:

IBM-850 <--> ISO8859-1 Common Latin characters
IBM-932 <--> IBM-eucJP Common Japanese characters

The following escape sequences designate standard code sets.

Escape Sequence Standard Code Set
01/11 02/04 02/09 04/01 GR right half of GB2312.1980-0.
01/11 02/13 04/01 GR right half of ISO8859-1.
01/11 02/13 04/02 GR right half of ISO8859-2.
01/11 02/13 04/03 GR right half of ISO8859-3.
01/11 02/13 04/04 GR right half of ISO8859-4.
01/11 02/13 04/06 GR right half of ISO8859-7.
01/11 02/13 04/07 GR right half of ISO8859-6.
01/11 02/13 04/08 GR right half of ISO8859-8.
01/11 02/13 04/12 GR right half of ISO8859-5.
01/11 02/13 04/13 GR right half of ISO8859-9.
01/11 02/09 04/09 GR right half of JIS X0201.1976-1.
01/11 02/04 02/09 04/02 GR JIS X0208.1983-1.
01/11 02/04 02/09 04/00 GR JIS X0208.1978-1.
01/11 02/09 04/02 GR 7-bit ASCII or left half of ISO8859-1.
01/11 02/05 02/15 03/01 M L 04/09 04/02 04/13 02/13 03/08 03/05 03/00 00/02 GR right half of IBM-850 unique characters. Characters common to ISO8859-1 should not use this escape sequence.
01/11 02/05 02/15 03/02 M L 04/09 04/02 04/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02 GR right half of Japanese user-definable characters.
01/11 02/08 04/02 GL 7-bit ASCII or left half of ISO8859-1.
01/11 02/14 04/01 GL right half of ISO8859-1.
01/11 02/14 04/02 GL right half of ISO8859-2.
01/11 02/14 04/03 GL right half of ISO8859-3.
01/11 02/14 04/04 GL right half of ISO8859-4.
01/11 02/14 04/06 GL right half of ISO8859-7.
01/11 02/14 04/07 GL right half of ISO8859-6.
01/11 02/14 04/08 GL right half of ISO8859-8.
01/11 02/14 04/12 GL right half of ISO8859-5.
01/11 02/14 04/13 GL right half of ISO8859-9.
01/11 02/08 04/09 GL right half of JIS X0201.1976-0.
01/11 02/08 04/10 GL left half of JIS X0201.1976.
01/11 02/04 02/08 04/02 GL JIS X0208.1983-0.
01/11 02/04 04/02 GL JIS X0208.1983-0.
01/11 02/04 04/00 GL JIS X0208.1978-0.
01/11 02/05 02/15 03/01 M L 06/09 06/02 06/13 02/13 03/08 03/05 03/00 00/02 GL right half of IBM-850 unique characters. Characters common to ISO8859-1 do not use this escape sequence.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02 GL Japanese (IBM-udcJP) user-definable characters.
01/11 02/04 02/09 04/03 GR KSC5601-1987.
01/11 02/04 02/09 03/00 GR CNS11643-1986-1.
01/11 02/04 02/10 03/01 GR CNS11643-1986-2.
01/11 02/05 02/15 03/02 M L 04/09 04/02 04/13 02/13 07/05 06/04 06/03 05/05 05/08 00/02 GR right half of Traditional Chinese user-definable characters.
01/11 02/05 02/15 03/02 M L 04/09 04/02 04/13 02/13 07/03 06/02 06/04 05/05 05/08 00/02 GR right half of IBM-850 unique symbols.
01/11 02/04 02/08 04/03 GL KSC5601-1987.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/05 06/04 06/03 05/05 05/08 00/02 GL Traditional Chinese (IBM-udcTW) user-definable characters.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/03 06/02 06/04 05/05 05/08 00/02 GL Traditional Chinese IBM-850 unique symbols (IBM-sbdTW) user-definable characters.
01/11 02/05 02/15 03/00 M L 05/05 05/04 04/06 02/13 03/08 00/02 UCS-2 encoded as UTF-8; used only for those characters not encoded by any of the escape sequences listed above.

When converting from a code set to fold8, the escape sequence used to designate the code set is chosen according to the order listed. For example, the JISX0208.1983-0 characters use 01/11 02/04 02/08 04/02 as the designation.
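
The escape sequences in the table above are written in column/row notation, where each cc/rr token stands for the byte value 16 * cc + rr. As a minimal sketch, the following Python snippet turns such a notation string into the bytes it denotes. The helper name decode_colrow is hypothetical and is not part of libiconv; the M and L placeholders that appear in the extended-segment sequences are deliberately rejected rather than interpreted.

```python
def decode_colrow(notation: str) -> bytes:
    """Convert column/row notation such as '01/11 02/13 04/01' to bytes.

    Each 'cc/rr' token denotes the byte 16 * cc + rr. This hypothetical
    helper accepts only numeric tokens; the M and L placeholders used by
    the extended-segment sequences raise ValueError.
    """
    out = bytearray()
    for token in notation.split():
        col, sep, row = token.partition("/")
        if not sep or not col.isdigit() or not row.isdigit():
            raise ValueError(f"not a column/row token: {token!r}")
        col_n, row_n = int(col), int(row)
        if col_n > 15 or row_n > 15:
            raise ValueError(f"column/row out of range: {token!r}")
        out.append(16 * col_n + row_n)
    return bytes(out)


# Designation of the right half of ISO8859-1 as GR (ESC, '-', 'A')
latin1_gr = decode_colrow("01/11 02/13 04/01")

# Code set name carried in the GR IBM-850 extended segment, ending in 00/02
ibm850_name = decode_colrow("04/09 04/02 04/13 02/13 03/08 03/05 03/00 00/02")
```

Decoding the name portion of an extended segment this way shows that it carries the code set name itself as text, terminated by the 00/02 byte.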

Files

The following list describes the fold8 converters found in the /usr/lib/nls/loc/iconv directory:

Converter Description
fold8_IBM-850 Interchange format to IBM-850
fold8_IBM-921 Interchange format to IBM-921
fold8_IBM-922 Interchange format to IBM-922
fold8_IBM-932 Interchange format to IBM-932
fold8_IBM-943 Interchange format to IBM-943
fold8_IBM-1124 Interchange format to IBM-1124
fold8_IBM-1129 Interchange format to IBM-1129
fold8_IBM-eucCN Interchange format to IBM-eucCN
fold8_IBM-eucJP Interchange format to IBM-eucJP
fold8_IBM-eucKR Interchange format to IBM-eucKR
fold8_IBM-eucTW Interchange format to IBM-eucTW
fold8_ISO8859-1 Interchange format to ISO8859-1
fold8_ISO8859-2 Interchange format to ISO8859-2
fold8_ISO8859-3 Interchange format to ISO8859-3
fold8_ISO8859-4 Interchange format to ISO8859-4
fold8_ISO8859-5 Interchange format to ISO8859-5
fold8_ISO8859-6 Interchange format to ISO8859-6
fold8_ISO8859-7 Interchange format to ISO8859-7
fold8_ISO8859-8 Interchange format to ISO8859-8
fold8_ISO8859-9 Interchange format to ISO8859-9
fold8_TIS-620 Interchange format to TIS-620
fold8_UTF-8 Interchange format to UTF-8
fold8_big5 Interchange format to big5
fold8_GBK Interchange format to GBK
IBM-921_fold8 IBM-921 to interchange format
IBM-922_fold8 IBM-922 to interchange format
IBM-850_fold8 IBM-850 to interchange format
IBM-932_fold8 IBM-932 to interchange format
IBM-943_fold8 IBM-943 to interchange format
IBM-1124_fold8 IBM-1124 to interchange format
IBM-1129_fold8 IBM-1129 to interchange format
IBM-eucCN_fold8 IBM-eucCN to interchange format
IBM-eucJP_fold8 IBM-eucJP to interchange format
IBM-eucKR_fold8 IBM-eucKR to interchange format
IBM-eucTW_fold8 IBM-eucTW to interchange format
ISO8859-1_fold8 ISO8859-1 to interchange format
ISO8859-2_fold8 ISO8859-2 to interchange format
ISO8859-3_fold8 ISO8859-3 to interchange format
ISO8859-4_fold8 ISO8859-4 to interchange format
ISO8859-5_fold8 ISO8859-5 to interchange format
ISO8859-6_fold8 ISO8859-6 to interchange format
ISO8859-7_fold8 ISO8859-7 to interchange format
ISO8859-8_fold8 ISO8859-8 to interchange format
ISO8859-9_fold8 ISO8859-9 to interchange format
TIS-620_fold8 TIS-620 to interchange format
UTF-8_fold8 UTF-8 to interchange format
big5_fold8 big5 to interchange format
GBK_fold8 GBK to interchange format
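
The file names in the list above follow a FromCode_ToCode pattern. As a rough sketch, the hypothetical helper below (not an AIX or libiconv interface) builds the expected file path of an interchange converter in either direction. It only formats a string and does not check that the file exists.

```python
import posixpath

ICONV_DIR = "/usr/lib/nls/loc/iconv"  # directory named in the text above


def interchange_converter_path(code_set: str, interchange: str = "fold8",
                               to_interchange: bool = True) -> str:
    """Return the expected path of an interchange converter file.

    Hypothetical helper: the file name is FromCode_ToCode, so converting a
    code set to the interchange format gives '<code_set>_<interchange>' and
    the reverse direction gives '<interchange>_<code_set>'.
    """
    if to_interchange:
        name = f"{code_set}_{interchange}"
    else:
        name = f"{interchange}_{code_set}"
    return posixpath.join(ICONV_DIR, name)


to_fold8 = interchange_converter_path("IBM-850")
from_fold8 = interchange_converter_path("IBM-850", to_interchange=False)
```

Passing "fold7" or "ct" as the interchange argument yields names of the same shape as the fold7 and compound text converters.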

List of Interchange Converters--Compound Text

Compound text interchange converters convert between compound text and internal code sets.

Compound text is an interchange encoding defined by the X Consortium. It is used to communicate text between X clients. Compound text is based on ISO2022 and can encode most character sets using standard escape sequences. It also provides extensions for encoding private character sets. The supported code sets provide a converter to and from compound text. The name used to identify the compound text encoding is ct.

The following escape sequences are used to designate standard code sets in the order listed below.

01/11 02/05 02/15 03/01 M L 04/09 04/02 04/13 02/13 03/08 03/05 03/00 00/02
                          GR right half of IBM-850 unique characters. Characters common to ISO8859-1 should not use this escape sequence.
01/11 02/05 02/15 03/02 M L 04/09 04/02 04/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02
                          GR right half of Japanese user-definable characters.
01/11 02/05 02/15 03/01 M L 06/09 06/02 06/13 02/13 03/08 03/05 03/00 00/02
                          GL right half of IBM-850 unique characters. Characters common to ISO8859-1 do not use this escape sequence.
01/11 02/05 02/15 03/02 M L 06/09 06/02 06/13 02/13 07/05 06/04 06/03 04/10 05/00 00/02
                          GL Japanese (IBM-udcJP) user-definable characters.

Files

The following list describes the compound text converters that are found in the /usr/lib/nls/loc/iconv directory:

Converter Description
ct_IBM-850 Interchange format to IBM-850
ct_IBM-921 Interchange format to IBM-921
ct_IBM-922 Interchange format to IBM-922
ct_IBM-932 Interchange format to IBM-932
ct_IBM-943 Interchange format to IBM-943
ct_IBM-1124 Interchange format to IBM-1124
ct_IBM-1129 Interchange format to IBM-1129
ct_IBM-eucCN Interchange format to IBM-eucCN
ct_IBM-eucJP Interchange format to IBM-eucJP
ct_IBM-eucKR Interchange format to IBM-eucKR
ct_IBM-eucTW Interchange format to IBM-eucTW
ct_ISO8859-1 Interchange format to ISO8859-1
ct_ISO8859-2 Interchange format to ISO8859-2
ct_ISO8859-3 Interchange format to ISO8859-3
ct_ISO8859-4 Interchange format to ISO8859-4
ct_ISO8859-5 Interchange format to ISO8859-5
ct_ISO8859-6 Interchange format to ISO8859-6
ct_ISO8859-7 Interchange format to ISO8859-7
ct_ISO8859-8 Interchange format to ISO8859-8
ct_ISO8859-9 Interchange format to ISO8859-9
ct_TIS-620 Interchange format to TIS-620
ct_big5 Interchange format to big5
ct_GBK Interchange format to GBK
IBM-850_ct IBM-850 to interchange format
IBM-921_ct IBM-921 to interchange format
IBM-922_ct IBM-922 to interchange format
IBM-932_ct IBM-932 to interchange format
IBM-943_ct IBM-943 to interchange format
IBM-1124_ct IBM-1124 to interchange format
IBM-1129_ct IBM-1129 to interchange format
IBM-eucCN_ct IBM-eucCN to interchange format
IBM-eucJP_ct IBM-eucJP to interchange format
IBM-eucKR_ct IBM-eucKR to interchange format
IBM-eucTW_ct IBM-eucTW to interchange format
ISO8859-1_ct ISO8859-1 to interchange format
ISO8859-2_ct ISO8859-2 to interchange format
ISO8859-3_ct ISO8859-3 to interchange format
ISO8859-4_ct ISO8859-4 to interchange format
ISO8859-5_ct ISO8859-5 to interchange format
ISO8859-6_ct ISO8859-6 to interchange format
ISO8859-7_ct ISO8859-7 to interchange format
ISO8859-8_ct ISO8859-8 to interchange format
ISO8859-9_ct ISO8859-9 to interchange format
TIS-620_ct TIS-620 to interchange format
big5_ct big5 to interchange format
GBK_ct GBK to interchange format

List of Interchange Converters--uucode

These converters provide the same mapping as the uuencode and uudecode commands.

During conversion from uucode, 62 bytes at a time (including the new-line character trailing the record) are converted, generating 45 bytes in outbuf.
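
The 62-to-45 byte relationship follows from the uuencode line format: one length character, 60 data characters carrying 45 bytes, and a trailing new-line character. The sketch below checks this with Python's binascii module, which implements the same line format and stands in here for the AIX converters (an assumption of the example, since the uucode converters themselves are not invoked).

```python
import binascii

record = bytes(range(45))            # one full 45-byte input record

# b2a_uu encodes at most 45 bytes per call and appends a new-line character.
line = binascii.b2a_uu(record)

# a2b_uu reverses the mapping for one encoded line.
decoded = binascii.a2b_uu(line)

encoded_length = len(line)           # length character + 60 data characters + new-line
decoded_length = len(decoded)
```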

Files

The following list describes the uucode converters found in the /usr/lib/nls/loc/iconv directory:

Converter Description
IBM-850_uucode IBM-850 to uucode
IBM-921_uucode IBM-921 to uucode
IBM-922_uucode IBM-922 to uucode
IBM-932_uucode IBM-932 to uucode
IBM-943_uucode IBM-943 to uucode
IBM-1124_uucode IBM-1124 to uucode
IBM-1129_uucode IBM-1129 to uucode
IBM-eucJP_uucode IBM-eucJP to uucode
IBM-eucKR_uucode IBM-eucKR to uucode
IBM-eucTW_uucode IBM-eucTW to uucode
IBM-eucCN_uucode IBM-eucCN to uucode
ISO8859-1_uucode ISO8859-1 to uucode
ISO8859-2_uucode ISO8859-2 to uucode
ISO8859-3_uucode ISO8859-3 to uucode
ISO8859-4_uucode ISO8859-4 to uucode
ISO8859-5_uucode ISO8859-5 to uucode
ISO8859-6_uucode ISO8859-6 to uucode
ISO8859-7_uucode ISO8859-7 to uucode
ISO8859-8_uucode ISO8859-8 to uucode
ISO8859-9_uucode ISO8859-9 to uucode
TIS-620_uucode TIS-620 to uucode
big5_uucode big5 to uucode
GBK_uucode GBK to uucode
uucode_IBM-850 uucode to IBM-850
uucode_IBM-921 uucode to IBM-921
uucode_IBM-922 uucode to IBM-922
uucode_IBM-932 uucode to IBM-932
uucode_IBM-943 uucode to IBM-943
uucode_IBM-1124 uucode to IBM-1124
uucode_IBM-1129 uucode to IBM-1129
uucode_IBM-eucCN uucode to IBM-eucCN
uucode_IBM-eucJP uucode to IBM-eucJP
uucode_IBM-eucKR uucode to IBM-eucKR
uucode_IBM-eucTW uucode to IBM-eucTW
uucode_ISO8859-1 uucode to ISO8859-1
uucode_ISO8859-2 uucode to ISO8859-2
uucode_ISO8859-3 uucode to ISO8859-3
uucode_ISO8859-4 uucode to ISO8859-4
uucode_ISO8859-5 uucode to ISO8859-5
uucode_ISO8859-6 uucode to ISO8859-6
uucode_ISO8859-7 uucode to ISO8859-7
uucode_ISO8859-8 uucode to ISO8859-8
uucode_ISO8859-9 uucode to ISO8859-9
uucode_TIS-620 uucode to TIS-620
uucode_big5 uucode to big5
uucode_GBK uucode to GBK

List of UCS-2 Interchange Converters

UCS-2 is a universal, 16-bit encoding described in "Code Set Overview". Conversions for each code set are provided in both directions, between the code set and UCS-2.

UCS-2 converters are found in the /usr/lib/nls/loc/uconvTable and /usr/lib/nls/loc/uconv directories. The uconvdef command is used to generate new converters or to customize existing UCS-2 converters.

The /usr/lib/nls/loc/iconv/Universal_UCS_Conv converter is used to generate conversions from any code set X to code set Y by setting the proper links:

cd /usr/lib/nls/loc/iconv
ln -s /usr/lib/nls/loc/uconv/Universal_UCS_Conv X_Y
ln -s /usr/lib/nls/loc/uconv/UCSTBL X_UCS-2
ln -s /usr/lib/nls/loc/uconv/UCSTBL UCS-2_Y
ln -s /usr/lib/nls/loc/uconv/UCSTBL X
ln -s /usr/lib/nls/loc/uconv/UCSTBL Y
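
To make the substitution of X and Y concrete, the following sketch builds the same five ln -s commands for a given pair of code set names. The function name and the example code sets IBM-850 and ISO8859-2 are illustrative assumptions; the snippet only formats command strings and does not create any links.

```python
UCONV_DIR = "/usr/lib/nls/loc/uconv"


def universal_conv_links(x: str, y: str) -> list[str]:
    """Build the ln -s commands from the text for code sets x and y.

    Hypothetical helper: it only formats strings, intended to be run from
    /usr/lib/nls/loc/iconv, and makes no file system changes itself.
    """
    links = [
        ("Universal_UCS_Conv", f"{x}_{y}"),
        ("UCSTBL", f"{x}_UCS-2"),
        ("UCSTBL", f"UCS-2_{y}"),
        ("UCSTBL", x),
        ("UCSTBL", y),
    ]
    return [f"ln -s {UCONV_DIR}/{target} {name}" for target, name in links]


commands = universal_conv_links("IBM-850", "ISO8859-2")
for command in commands:
    print(command)
```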

Converter Description
ISO8859-1 UCS-2 <--> ISO Latin-1
ISO8859-2 UCS-2 <--> ISO Latin-2
ISO8859-3 UCS-2 <--> ISO Latin-3
ISO8859-4 UCS-2 <--> ISO Latin-4
ISO8859-5 UCS-2 <--> ISO Cyrillic
ISO8859-6 UCS-2 <--> ISO Arabic
ISO8859-7 UCS-2 <--> ISO Greek
ISO8859-8 UCS-2 <--> ISO Hebrew
ISO8859-9 UCS-2 <--> ISO Turkish
JISX0201.1976-0 UCS-2 <--> Japanese JISX0201-0
JISX0208.1983-0 UCS-2 <--> Japanese JISX0208-0
CNS11643.1986-1 UCS-2 <--> Chinese CNS11643-1
CNS11643.1986-2 UCS-2 <--> Chinese CNS11643-2
KSC5601.1987-0 UCS-2 <--> Korean KSC5601-0
IBM-eucCN UCS-2 <--> Simplified Chinese EUC
IBM-udcCN UCS-2 <--> Simplified Chinese user-defined characters
IBM-sbdCN UCS-2 <--> Simplified Chinese IBM-specific characters
GB2312.1980-0 UCS-2 <--> Simplified Chinese GB
IBM-1381 UCS-2 <--> Simplified Chinese PC data code
IBM-935 UCS-2 <--> Simplified Chinese EBCDIC
IBM-936 UCS-2 <--> Simplified Chinese PC5550
IBM-eucJP UCS-2 <--> Japanese EUC
IBM-eucKR UCS-2 <--> Korean EUC
IBM-eucTW UCS-2 <--> Traditional Chinese EUC
IBM-udcJP UCS-2 <--> Japanese user-defined characters
IBM-udcTW UCS-2 <--> Traditional Chinese user-defined characters
IBM-sbdTW UCS-2 <--> Traditional Chinese IBM-specific characters
UTF-8 UCS-2 <--> UTF-8
IBM-437 UCS-2 <--> USA PC data code
IBM-850 UCS-2 <--> Latin-1 PC data code
IBM-852 UCS-2 <--> Latin-2 PC data code
IBM-857 UCS-2 <--> Turkish PC data code
IBM-860 UCS-2 <--> Portuguese PC data code
IBM-861 UCS-2 <--> Icelandic PC data code
IBM-863 UCS-2 <--> French Canadian PC data code
IBM-865 UCS-2 <--> Nordic PC data code
IBM-869 UCS-2 <--> Greek PC data code
IBM-921 UCS-2 <--> Baltic Multilingual data code
IBM-922 UCS-2 <--> Estonian data code
IBM-932 UCS-2 <--> Japanese PC data code
IBM-943 UCS-2 <--> Japanese PC data code
IBM-934 UCS-2 <--> Korean PC data code
IBM-936 UCS-2 <--> People's Republic of China PC data code
IBM-938 UCS-2 <--> Taiwanese PC data code
IBM-942 UCS-2 <--> Extended Japanese PC data code
IBM-944 UCS-2 <--> Korean PC data code
IBM-946 UCS-2 <--> People's Republic of China SAA data code
IBM-948 UCS-2 <--> Traditional Chinese PC data code
IBM-1124 UCS-2 <--> Ukrainian PC data code
IBM-1129 UCS-2 <--> Vietnamese PC data code
TIS-620 UCS-2 <--> Thailand PC data code
IBM-037 UCS-2 <--> USA, Canada EBCDIC
IBM-273 UCS-2 <--> Germany, Austria EBCDIC
IBM-277 UCS-2 <--> Denmark, Norway EBCDIC
IBM-278 UCS-2 <--> Finland, Sweden EBCDIC
IBM-280 UCS-2 <--> Italy EBCDIC
IBM-284 UCS-2 <--> Spain, Latin America EBCDIC
IBM-285 UCS-2 <--> United Kingdom EBCDIC
IBM-297 UCS-2 <--> France EBCDIC
IBM-500 UCS-2 <--> International EBCDIC
IBM-875 UCS-2 <--> Greek EBCDIC
IBM-930 UCS-2 <--> Japanese Katakana-Kanji EBCDIC
IBM-933 UCS-2 <--> Korean EBCDIC
IBM-937 UCS-2 <--> Traditional Chinese EBCDIC
IBM-939 UCS-2 <--> Japanese Latin-Kanji EBCDIC
IBM-1026 UCS-2 <--> Turkish EBCDIC
IBM-1112 UCS-2 <--> Baltic Multilingual EBCDIC
IBM-1122 UCS-2 <--> Estonian EBCDIC
IBM-1124 UCS-2 <--> Ukrainian EBCDIC
IBM-1129 UCS-2 <--> Vietnamese EBCDIC
GBK UCS-2 <--> Simplified Chinese
TIS-620 UCS-2 <--> Thailand EBCDIC

List of UTF-8 Interchange Converters

UTF-8 is a universal, multibyte encoding described in "UCS-2 and UTF-8". Conversions for each code set are provided in both directions, between the code set and UTF-8.

UTF-8 conversions are usually performed by using the Universal_UCS_Conv converter (see "List of UCS-2 Interchange Converters") together with the /usr/lib/nls/loc/uconv/UTF-8 conversion.
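
Conceptually, such a conversion pivots through UCS-2: the source code set is first mapped to UCS-2, and the UCS-2 value is then encoded as UTF-8. The sketch below illustrates the idea with Python's built-in codecs, which stand in for the AIX converters here. The function name is hypothetical, and big-endian byte order for the intermediate UCS-2 form is an assumption of the example.

```python
def to_utf8_via_ucs2(data: bytes, source_codec: str) -> bytes:
    """Convert data to UTF-8 in two steps, using UCS-2 as the pivot.

    Python's 'utf-16-be' codec is used as the UCS-2 form; the two are
    identical for characters in the Basic Multilingual Plane, which is
    all this sketch is meant to handle.
    """
    ucs2 = data.decode(source_codec).encode("utf-16-be")   # code set -> UCS-2
    return ucs2.decode("utf-16-be").encode("utf-8")        # UCS-2 -> UTF-8


# 0xE9 is LATIN SMALL LETTER E WITH ACUTE in ISO8859-1
ucs2_form = b"\xe9".decode("iso8859-1").encode("utf-16-be")
utf8_form = to_utf8_via_ucs2(b"\xe9", "iso8859-1")
```

The single ISO8859-1 byte becomes a two-byte UCS-2 value and then a two-byte UTF-8 sequence, which is why output buffers for these conversions generally need to be larger than the input.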

Converter Description
ISO8859-1 UTF-8 <--> ISO Latin-1
ISO8859-2 UTF-8 <--> ISO Latin-2
ISO8859-3 UTF-8 <--> ISO Latin-3
ISO8859-4 UTF-8 <--> ISO Latin-4
ISO8859-5 UTF-8 <--> ISO Cyrillic
ISO8859-6 UTF-8 <--> ISO Arabic
ISO8859-7 UTF-8 <--> ISO Greek
ISO8859-8 UTF-8 <--> ISO Hebrew
ISO8859-9 UTF-8 <--> ISO Turkish
JISX0201.1976-0 UTF-8 <--> Japanese JISX0201-0
JISX0208.1983-0 UTF-8 <--> Japanese JISX0208-0
CNS11643.1986-1 UTF-8 <--> Chinese CNS11643-1
CNS11643.1986-2 UTF-8 <--> Chinese CNS11643-2
KSC5601.1987-0 UTF-8 <--> Korean KSC5601-0
IBM-eucCN UTF-8 <--> Simplified Chinese EUC
IBM-eucJP UTF-8 <--> Japanese EUC
IBM-eucKR UTF-8 <--> Korean EUC
IBM-eucTW UTF-8 <--> Traditional Chinese EUC
IBM-udcJP UTF-8 <--> Japanese user-defined characters
IBM-udcTW UTF-8 <--> Traditional Chinese user-defined characters
IBM-sbdTW UTF-8 <--> Traditional Chinese IBM-specific characters
UCS-2 UTF-8 <--> UCS-2
IBM-437 UTF-8 <--> USA PC data code
IBM-850 UTF-8 <--> Latin-1 PC data code
IBM-852 UTF-8 <--> Latin-2 PC data code
IBM-857 UTF-8 <--> Turkish PC data code
IBM-860 UTF-8 <--> Portuguese PC data code
IBM-861 UTF-8 <--> Icelandic PC data code
IBM-863 UTF-8 <--> French Canadian PC data code
IBM-865 UTF-8 <--> Nordic PC data code
IBM-869 UTF-8 <--> Greek PC data code
IBM-921 UTF-8 <--> Baltic Multilingual data code
IBM-922 UTF-8 <--> Estonian data code
IBM-932 UTF-8 <--> Japanese PC data code
IBM-943 UTF-8 <--> Japanese PC data code
IBM-934 UTF-8 <--> Korean PC data code
IBM-935 UTF-8 <--> Simplified Chinese EBCDIC
IBM-936 UTF-8 <--> People's Republic of China PC data code
IBM-938 UTF-8 <--> Taiwanese PC data code
IBM-942 UTF-8 <--> Extended Japanese PC data code
IBM-944 UTF-8 <--> Korean PC data code
IBM-946 UTF-8 <--> People's Republic of China SAA data code
IBM-948 UTF-8 <--> Traditional Chinese PC data code
IBM-1124 UTF-8 <--> Ukrainian PC data code
IBM-1129 UTF-8 <--> Vietnamese PC data code
TIS-620 UTF-8 <--> Thailand PC data code
IBM-037 UTF-8 <--> USA, Canada EBCDIC
IBM-273 UTF-8 <--> Germany, Austria EBCDIC
IBM-277 UTF-8 <--> Denmark, Norway EBCDIC
IBM-278 UTF-8 <--> Finland, Sweden EBCDIC
IBM-280 UTF-8 <--> Italy EBCDIC
IBM-284 UTF-8 <--> Spain, Latin America EBCDIC
IBM-285 UTF-8 <--> United Kingdom EBCDIC
IBM-297 UTF-8 <--> France EBCDIC
IBM-500 UTF-8 <--> International EBCDIC
IBM-875 UTF-8 <--> Greek EBCDIC
IBM-930 UTF-8 <--> Japanese Katakana-Kanji EBCDIC
IBM-933 UTF-8 <--> Korean EBCDIC
IBM-937 UTF-8 <--> Traditional Chinese EBCDIC
IBM-939 UTF-8 <--> Japanese Latin-Kanji EBCDIC
IBM-1026 UTF-8 <--> Turkish EBCDIC
IBM-1112 UTF-8 <--> Baltic Multilingual EBCDIC
IBM-1122 UTF-8 <--> Estonian EBCDIC
IBM-1124 UTF-8 <--> Ukrainian EBCDIC
IBM-1129 UTF-8 <--> Vietnamese EBCDIC
IBM-1381 UTF-8 <--> Simplified Chinese PC data code
GBK UTF-8 <--> Simplified Chinese
TIS-620 UTF-8 <--> Thailand EBCDIC

List of Miscellaneous Converters

A set of low-level converters, called miscellaneous converters, is provided for use by the code set and interchange converters. Using these converters directly is discouraged because they are intended only to support other converters.

Files

The following list describes the miscellaneous converters found in the /usr/lib/nls/loc/iconv and /usr/lib/nls/loc/iconvTable directories:

Converter Description
IBM-932_JISX0201.1976-0 IBM-932 to JISX0201.1976-0
IBM-932_JISX0208.1983-0 IBM-932 to JISX0208.1983-0
IBM-932_IBM-udcJP IBM-932 to IBM-udcJP (Japanese user-defined characters)
IBM-943_JISX0201.1976-0 IBM-943 to JISX0201.1976-0
IBM-943_JISX0208.1983-0 IBM-943 to JISX0208.1983-0
IBM-943_IBM-udcJP IBM-943 to IBM-udcJP (Japanese user-defined characters)
IBM-eucJP_JISX0201.1976-0 IBM-eucJP to JISX0201.1976-0
IBM-eucJP_JISX0208.1983-0 IBM-eucJP to JISX0208.1983-0
IBM-eucJP_IBM-udcJP IBM-eucJP to IBM-udcJP (Japanese user-defined characters)
IBM-eucKR_KSC5601.1987-0 IBM-eucKR to KSC5601.1987-0
IBM-eucTW_CNS11643.1986-1 IBM-eucTW to CNS11643.1986-1
IBM-eucTW_CNS11643.1986-2 IBM-eucTW to CNS11643.1986-2
IBM-eucCN_GB2312.1980-0 IBM-eucCN to GB2312.1980-0

Related Information

National Language Support Overview for Programming, List of National Language Support Subroutines.

Code Sets Overview in AIX Kernel Extensions and Device Support Programming Concepts.

The iconv command, uuencode and uudecode commands.

The iconv_open subroutine, iconv subroutine, iconv_close subroutine.

