Visustin options: Source encoding and Font script

Visustin supports a wide range of source file encodings and character sets. It automatically loads ASCII, Unicode UTF-8, UTF-16 and UTF-32 encoded files, as well as files saved in the Windows default codepage, without the need for any settings. To load other files, including files with foreign national characters, use the Source encoding or Font script settings.

General

Visustin supports the following kinds of source files:

Automatic support exists for Unicode (UTF-) encoded files and files saved in the system default Windows codepage. To load files stored with another encoding, use the Source encoding setting, possibly followed by Font script. If some characters look incorrect, check the Font script setting. You can also view the Step-by-step instructions to display foreign characters.

Source encoding

Plain-text source code files can be saved in many character encodings. Typical encodings include ASCII, Unicode and Windows ANSI. By default, Visustin auto-detects the encoding. You can load most files without choosing any settings at all. Should you see odd characters, auto-detection may have failed, and you need to use the Source encoding setting.

The Source encoding setting lets you choose which text encoding (codepage) your source files are in. It lets you load DOS, Mac and IBM EBCDIC files into Visustin. You can also load files saved in a codepage different from that of your Windows installation, which is handy when working with foreign language programs. Code stored in any codepage supported by Windows can be loaded.
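
To illustrate why the codepage choice matters, here is a minimal Python sketch (not Visustin's own code) that decodes the same bytes under two codepages. The codec names `cp866` and `cp1252` are Python's names for the DOS Cyrillic and Windows Western codepages; the helper `decode_with` is hypothetical.

```python
# The same bytes mean different text under different codepages.
# Cyrillic text as a DOS (OEM 866) program would store it on disk:
raw = "Привет".encode("cp866")

def decode_with(data: bytes, codepage: str) -> str:
    """Decode bytes with the given codepage, replacing undecodable bytes."""
    return data.decode(codepage, errors="replace")

as_dos_cyrillic = decode_with(raw, "cp866")      # the matching codepage
as_windows_western = decode_with(raw, "cp1252")  # a mismatched codepage

print(as_dos_cyrillic)     # readable Cyrillic
print(as_windows_western)  # odd characters
```

Odd characters of the second kind are the typical sign that a file was read with the wrong Source encoding.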

The Source encoding setting affects source code loaded from files, but not code that you Copy & Paste into Visustin. The default is Automatic mode, which is appropriate in most cases. The other modes are DOS, Mac and Other. Other lets you choose a specific encoding (codepage).

Source encoding options

| Mode | Codepage | Usage |
|---|---|---|
| Automatic (default) | Windows "ANSI" | Automatic mode loads most source files. It auto-detects the encoding by examining file contents. First it checks if the file is plain ASCII or Unicode (UTF-8, UTF-16 or UTF-32). If the file declares its own encoding*), Visustin uses that one. Other files load with the Windows codepage that matches your current font script setting. |
| DOS | OEM | DOS mode loads code using the system default OEM codepage. It's the codepage you use with command line programs.**) |
| Mac | Mac | Mac mode loads code using the system default Macintosh codepage, such as 10000 Mac Roman. |
| Other | All | Other allows you to select a specific file encoding (codepage). This setting is useful when the Automatic mode is not giving the correct results. As an example, you can load Cyrillic DOS code while running a West European Windows. Choose Other to load EBCDIC or UTF-7 code, or to force a Japanese codepage in USA. |

*) Certain web languages, such as ColdFusion, MXML, PHP, Python, and XSLT, can define the codepage in the source file. Automatic mode handles these files correctly.
**) Certain DOS languages, such as QuickBASIC, are always in a DOS codepage. Both Automatic and DOS modes work equally well. Use Other with a specific DOS codepage if some characters are incorrect.
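
As an illustration of the in-file encoding declarations mentioned in the first footnote, the following Python sketch looks for two common forms: a Python coding comment on one of the first two lines, and the `encoding` attribute of an XML prolog (as used by XSLT and MXML). The function name and the patterns are assumptions made for this example; they do not describe how Visustin itself parses declarations.

```python
import re
from typing import Optional

# Hypothetical patterns for two well-known declaration styles.
PYTHON_DECL = re.compile(rb"[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")
XML_DECL = re.compile(rb"<\?xml[^>]*encoding=[\"']([-\w.]+)[\"']")

def declared_encoding(data: bytes) -> Optional[str]:
    """Return the encoding name a file declares for itself, if any."""
    head = data[:1024]
    xml_match = XML_DECL.search(head)
    if xml_match:
        return xml_match.group(1).decode("ascii")
    # A Python coding comment is only honored on the first two lines.
    for line in head.split(b"\n")[:2]:
        py_match = PYTHON_DECL.match(line)
        if py_match:
            return py_match.group(1).decode("ascii")
    return None

print(declared_encoding(b"# -*- coding: cp1251 -*-\nprint('x')\n"))
```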

Unicode files (UTF-8, UTF-16, UTF-32) starting with signature bytes (BOM, byte order mark) are automatically loaded with the correct encoding regardless of the Source encoding setting. UTF-8 files without BOM are auto-detected in Automatic mode, but not in other modes.
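
The detection order described above (signature bytes first, then plain ASCII, then UTF-8 without BOM, then a codepage fallback) can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions, not Visustin's actual algorithm: the function name `guess_encoding` and the `cp1252` fallback are hypothetical, and the sketch does not attempt to detect UTF-16 or UTF-32 files that lack a BOM.

```python
import codecs

# Signature bytes (BOMs). The UTF-32 entries come first because the
# UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def guess_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Guess a Python codec name for the raw bytes of a source file."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name          # a BOM wins regardless of other settings
    if data.isascii():           # pure 7-bit ASCII
        return "ascii"
    try:
        data.decode("utf-8")     # valid UTF-8 without BOM
        return "utf-8"
    except UnicodeDecodeError:
        return fallback          # assume a system codepage

print(guess_encoding("ä".encode("utf-8")))
```

The `utf-16`, `utf-32` and `utf-8-sig` codecs consume the BOM while decoding, so the decoded text does not start with a stray signature character.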

ASCII files (pure 7-bit ASCII) can be loaded with any setting except for Other/EBCDIC.
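
The reason is that the common Windows, DOS, Mac and UTF-8 encodings all agree with ASCII on the first 128 byte values, whereas EBCDIC does not. A small Python sketch of this, using Python's codec names (`cp037` is one EBCDIC codepage, chosen here only as an example):

```python
text = "IF x > 0 THEN PRINT x"
ascii_bytes = text.encode("ascii")

# ASCII-compatible encodings: Windows Western, DOS US, Mac Roman,
# Japanese Shift-JIS (Windows variant) and UTF-8.
compatible = ["cp1252", "cp437", "mac_roman", "cp932", "utf-8"]
same_everywhere = all(ascii_bytes.decode(cp) == text for cp in compatible)

# EBCDIC assigns different byte values to the same characters,
# so ASCII bytes read as EBCDIC come out wrong.
ebcdic_misread = ascii_bytes.decode("cp037")

print(same_everywhere)
print(ebcdic_misread == text)
```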

Codepage summary:

Font script

Supported scripts
Western European
Arabic
Baltic
Central European
Chinese Simplified
Chinese Traditional
Cyrillic
Greek
Hebrew
Japanese
Korean
Thai
Turkish
Vietnamese
Others: limited support

The script of the selected font controls the display of national characters in the Visustin code panel and Visustin Editor. Select a script via the Font setting in the Options menu. By default, Visustin uses the Windows default script. In the Americas and Western Europe, the default script is Western. You can choose any other. In the following picture, the Cyrillic script is selected for displaying code with Cyrillic characters.

Font options

Font script controls the following features:

Automatic script detection. If you load source code that does not display well with the current script, Visustin may temporarily override your font script setting. This helps you view code with, say, Greek symbols even if you normally work with Western characters only, and have your script set to Western. Automatic script detection works best with Unicode and UTF-8 encoded source files, or when Source encoding is set to Other. If script detection seems to have failed, try choosing another Font script.

Tip: If you have trouble displaying national characters, try the following fonts:

Limitations

Not all fonts can be used for Chinese, Japanese or Korean on some Windows systems. Use Arial Unicode MS with the correct script if you have problems.

Right-to-left scripts (Arabic and Hebrew) are partially implemented. Notably, line wrapping of mixed Latin and Arabic/Hebrew text may be incorrect.

All scripts are supported for PDF files, but some non-Latin characters (such as in Arabic and Thai) may display incorrectly.

Metafile creation may fail for characters not in the current ANSI codepage.

Characters beyond the Unicode Basic Multilingual Plane (BMP) are not supported. These characters are rarely used in source code.
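
To check in advance whether a source file contains such characters, you can look for code points above U+FFFF. A minimal Python sketch, with a hypothetical helper name:

```python
from typing import List

def beyond_bmp(text: str) -> List[str]:
    """Return the characters whose code point lies above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]

# Greek omega (U+03A9) is inside the BMP; U+1F600 is outside it.
print(beyond_bmp("x = 'Ω'"))
print(len(beyond_bmp("s = '\U0001F600'")))
```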

Step-by-step instructions to display foreign characters

Visustin supports national characters automatically. By national characters we mean characters other than ASCII 0-127, such as Ä, Æ, þ, Ω, Ж, ش, 兒 or 하. The default automatic settings work when your code is in the same character set as your system. That is, Japanese characters display correctly on a Japanese Windows and Cyrillic characters on a Russian Windows.

When the code is in a foreign character set (different from your Windows setting), you need to set some options. The following instructions are for Japanese. They also work for Arabic, Baltic, Central European, Chinese, Cyrillic, Greek, Hebrew, Korean, Turkish and Vietnamese.

Step 1: Choose Font

  1. Set Options|Source encoding to Automatic (the default).
  2. Select Options|Font and choose an appropriate font and the correct script. In this case, select a font with a Japanese script.
  3. Load some code.

If the code doesn't display correctly, proceed to step 2.

Step 2: Specify Source encoding

  1. Select an appropriate codepage in Options|Source encoding|Other. For Japanese, try 932 ANSI/OEM Japanese Shift-JIS, unless you know the code is EBCDIC or some other codepage. For languages other than Japanese, look up the language name among codepages 874 to 1258 and pick an entry containing the string ANSI.
  2. Verify the font script matches the Source encoding.
  3. Load some code.
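
For reference, the Windows ANSI codepages in the 874 to 1258 range correspond to the supported scripts as shown in the Python sketch below. The codepage numbers are the standard Windows ones; the dictionary, the function name `load_source` and the use of Python codecs are assumptions made for this illustration and are not part of Visustin.

```python
# Windows ANSI codepages for each script, as Python codec names.
ANSI_CODEPAGES = {
    "Thai": "cp874",
    "Japanese": "cp932",
    "Chinese Simplified": "cp936",
    "Korean": "cp949",
    "Chinese Traditional": "cp950",
    "Central European": "cp1250",
    "Cyrillic": "cp1251",
    "Western European": "cp1252",
    "Greek": "cp1253",
    "Turkish": "cp1254",
    "Hebrew": "cp1255",
    "Arabic": "cp1256",
    "Baltic": "cp1257",
    "Vietnamese": "cp1258",
}

def load_source(data: bytes, script: str) -> str:
    """Decode file bytes with the ANSI codepage of the given script."""
    return data.decode(ANSI_CODEPAGES[script], errors="replace")

# A Japanese comment as stored in a Shift-JIS (codepage 932) file.
sample = "' コメント".encode("cp932")
print(load_source(sample, "Japanese"))
```

Decoding the same bytes with a different script's codepage produces the kind of incorrect characters that Step 2 is meant to fix.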

Step 3: Change system locale

If you cannot get national characters to show up, your system may not be configured to display them. You need to change the system locale. If your code contains Japanese characters, try the Japanese locale. To select another system locale in Windows 7, log in as Administrator, go to Control Panel, Regional Options, Administrative and set Language for non-Unicode programs.

See also

Options

©Aivosto Oy Visustin Help