Supported Encodings
The table below lists the encodings that the Content component supports for text processing and indexing. Not every encoding is valid for every supported language. For the most common supported encodings for each language, see Supported Languages and Common Encodings.
| Content component name | XML name | ISO name |
|---|---|---|
| ARABIC | Windows-1256 | CP_1256 |
| ARABIC_ISO | ISO-8859-6 | ISO8859-6 |
| ARABIC_MAC | x-mac-arabic | CP_10004 |
| ASCII | ISO-8859-1 | CP_ACP |
| ASCII_IBM | IBM850 | CP_850 |
| CHINESESIMPLIFIED | GBK | CP_936 |
| CHINESETRADITIONAL | Big5 | CP_950 |
| CYRILLIC | Windows-1251 | CP_1251 |
| CYRILLIC_DOS | IBM866 | CP_866 |
| CYRILLIC_ISO | ISO-8859-5 | ISO8859-5 |
| CYRILLIC_KOI8 | KOI8-R | CP_21866 |
| EASTERNEUROPEAN | Windows-1250 | CP_1250 |
| EASTERNEUROPEAN_ISO | ISO-8859-2 | ISO8859-2 |
| EUC | EUC-JP | |
| GREEK | Windows-1253 | CP_1253 |
| GREEK_ISO | ISO-8859-7 | ISO8859-7 |
| HEBREW | Windows-1255 | CP_1255 |
| HEBREW_ISO | ISO-8859-8 | ISO8859-8 |
| JIS | JIS_Encoding | |
| KOREAN | KS_C_5601-1987 | CP_949 |
| LATIN3 | ISO-8859-3 | ISO8859-3 |
| LATIN5 | ISO-8859-9 | ISO8859-9 |
| LATIN6 | ISO-8859-14 | ISO8859-14 |
| LATIN7 | ISO-8859-13 | ISO8859-13 |
| LATIN9 | ISO-8859-15 | ISO8859-15 |
| NORTHERNEUROPEAN | Windows-1257 | CP_1257 |
| NORTHERNEUROPEAN_ISO | ISO-8859-4 | ISO8859-4 |
| SHIFTJIS | Shift_JIS | CP_932 |
| THAI | TIS-620 | CP_874 |
| TURKISH | Windows-1254 | CP_1254 |
| UCS2 | ISO-10646-UCS-2 | ISO-10646 |
| UTF8 | UTF-8 | CP_UTF8 |
| VIETNAMESE | Windows-1258 | CP_1258 |
| WESTERNEUROPEAN | Windows-1252 | CP_1252 |
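
As an illustration of how the table can be used, the following minimal Python sketch maps a Content component encoding name to its XML name and then transcodes raw bytes to UTF-8. It copies only a small subset of the rows above. The names `CONTENT_TO_XML`, `xml_name`, and `to_utf8` are hypothetical helpers, not part of the Content component, and the sketch assumes that Python's codec registry recognizes the XML names it uses (which holds for this subset, but not necessarily for every row in the table).

```python
import codecs

# Subset of the table above: Content component name -> XML name.
# Hypothetical helper data, copied by hand from the table.
CONTENT_TO_XML = {
    "ARABIC": "Windows-1256",
    "CHINESESIMPLIFIED": "GBK",
    "CYRILLIC": "Windows-1251",
    "CYRILLIC_KOI8": "KOI8-R",
    "GREEK_ISO": "ISO-8859-7",
    "SHIFTJIS": "Shift_JIS",
    "UTF8": "UTF-8",
    "WESTERNEUROPEAN": "Windows-1252",
}


def xml_name(content_name: str) -> str:
    """Return the XML name for a Content component encoding name."""
    try:
        return CONTENT_TO_XML[content_name.upper()]
    except KeyError:
        raise ValueError(f"encoding not in this subset: {content_name}") from None


def to_utf8(raw: bytes, content_name: str) -> bytes:
    """Decode bytes in the named encoding and re-encode them as UTF-8."""
    # Assumption: Python's codec registry resolves the XML name.
    codec = codecs.lookup(xml_name(content_name))
    return raw.decode(codec.name).encode("utf-8")
```

For example, `xml_name("greek_iso")` returns `"ISO-8859-7"`, and `to_utf8(data, "CYRILLIC")` converts Windows-1251 bytes to UTF-8. A name outside the copied subset raises `ValueError` rather than guessing a codec.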