Control characters in ASCII and Unicode

Dozens of odd control characters appear in ASCII charts, and the same characters have found their way into Unicode as well. CR, LF, ESC, CAN... what are all these codes for? Should I care about them? This is an in-depth look into control characters in ASCII and its descendants, including Unicode, ANSI and ISO standards.

When ASCII first appeared in the 1960s, control characters were an essential part of the new character set. Since then, many new character sets and standards have been published. Computing is not the same either. What happened to the control characters? Are they still used and if yes, for what?

This article looks back at the history of character sets while keeping an eye on modern use. The information is based on a number of standards released by ANSI, ISO, ECMA and The Unicode Consortium, as well as industry practice. In many cases, the standards define one use for a character, but common practice is different. Some characters are used contrary to the standards. In addition, certain characters were originally defined in an ambiguous or loose way, which has resulted in confusion in their use.

See also: Character sets

Contents

This article starts by looking at the history of control characters in standards. We then move to modern times. The rest of the article lists all the control characters in detail.

C0    NUL  00  SOH  01  STX  02  ETX  03  EOT  04  ENQ  05  ACK  06  BEL  07
      BS   08  HT   09  LF   0A  VT   0B  FF   0C  CR   0D  SO   0E  SI   0F
      DLE  10  DC1  11  DC2  12  DC3  13  DC4  14  NAK  15  SYN  16  ETB  17
      CAN  18  EM   19  SUB  1A  ESC  1B  FS   1C  GS   1D  RS   1E  US   1F
      SP   20  . . . .  DEL  7F
C1    PAD  80  HOP  81  BPH  82  NBH  83  IND  84  NEL  85  SSA  86  ESA  87
      HTS  88  HTJ  89  VTS  8A  PLD  8B  PLU  8C  RI   8D  SS2  8E  SS3  8F
      DCS  90  PU1  91  PU2  92  STS  93  CCH  94  MW   95  SPA  96  EPA  97
      SOS  98  SGCI 99  SCI  9A  CSI  9B  ST   9C  OSC  9D  PM   9E  APC  9F
8859  NBSP A0  SHY  AD

Categories: Delimiter, Introducer, Shift, Format effector, Info separator, Presentation control, Graphic, Area definition, Device control, Transmission control, Misc, Not assigned

The table above summarizes the control characters. The character codes are given in hexadecimal. Color coding indicates character category.

Groups of control characters

Control character areas

For the purposes of this document, control characters are divided into three groups.

1. ASCII control characters. The ASCII control character area covers code positions 0–31 (hex 00–1F). This area is also called the C0 set. Two additional controls appear at 32 and 127 (hex 20 and 7F). The ASCII control characters cover a wide range of uses, such as text layout, transmission and device control, and more.

2. C1 control characters. C1 covers positions 128–159 (hex 80–9F). C1 is primarily for displays and printers. This set is related to ANSI escape sequences and VT100.

3. ISO 8859 special characters. Two special characters, NBSP and SHY, are from ISO 8859. They are also used in Windows and Unicode. They appear at 160 and 173 (hex A0 and AD).
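The three groups can be told apart by code position alone. The following Python sketch illustrates the division; the function name `control_group` and the labels it returns are made up for this example and do not come from any standard.

```python
def control_group(code: int) -> str:
    """Return the group a code position belongs to, per the division above."""
    if 0x00 <= code <= 0x1F:
        return "C0"                 # ASCII control area, positions 0-31
    if code in (0x20, 0x7F):
        return "ASCII additional"   # SP and DEL, outside C0 but related to it
    if 0x80 <= code <= 0x9F:
        return "C1"                 # positions 128-159
    if code in (0xA0, 0xAD):
        return "ISO 8859 special"   # NBSP and SHY
    return "other"                  # not a character covered by this article


for code in (0x0A, 0x20, 0x7F, 0x85, 0xA0, 0x41):
    print(f"{code:3d}  hex {code:02X}  {control_group(code)}")
```

Note that the sketch looks at numeric positions only. Whether a byte at 128–159 really is a C1 control depends on the character set in use.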

Note: These control character sets are not the only control characters ever used. Other C0 and C1 sets do exist. Alternative sets were defined for special uses. In them, some of the standard C0/C1 controls have been deleted or replaced by new controls. Even totally different alternative sets exist. Alternative control characters are not discussed in this article. One can find them in the International Register of Coded Character Sets.

Control characters in standards

ASCII control characters

C0 = positions 0–31. Origin with ASCII and ISO 646 character sets. Characters SP and DEL appear together with C0.

The first group of control characters originates from ASCII. These characters consist of a set called C0 and two additional characters. The C0 set is in locations 0 to 31. Two additional ASCII characters, SP and DEL, fall outside the C0 area, but they are closely related to the C0 set. All of these characters are defined by the same standards.

This set of control characters covers many uses. There are "Format Effectors" that control the appearance of plain text. There are "Transmission Controls" for use with transmission protocols and "Device Controls" to start, operate and stop auxiliary devices. There are "Information Separators" that delimit various pieces of data. Other controls exist for producing alerts, filling media, indicating end of media, and for dealing with errors. There are even controls to create new characters and controls. The C0 set was defined with perforated tape, punched cards and typewriter-like devices in mind. Devices have changed since then, but the C0 controls have survived.

History of ASCII control characters

The first version of ASCII was released in 1963. Like the ASCII of today, the 1963 version covered some letters and symbols, as well as control characters. While many of those 35 control characters were similar to those of modern ASCII, some were different. ASCII-1963 had some serious shortcomings, such as no support for lower case letters. It quickly turned out that the standard had to be revised. Today, ASCII-1963 is practically forgotten. Since ASCII-1963 deviates a lot from later ASCII versions in the control character area too, we will not go any deeper into it.

The next revision was ASCII-1965. This version, although formally accepted, was not published. Another revision was going to take place. ASCII as we know it is based on the ASCII-1967 standard (USAS X3.4-1967). This version was an important milestone. It was already very close to the version that then became widely used.

In 1968 ASCII was slightly updated and released as USAS X3.4-1968 (later retroactively renamed ANSI X3.4-1968). The actual updates were very small: they only added an option to use the character LF as a "newline" and designated ASCII and USASCII as the names of the standard. (Later on, the name USASCII was dropped, leaving ASCII as the official name.)

ASCII-1968 became immensely popular. Almost all of today's computer systems use ASCII or one of its descendants. (A notable exception is EBCDIC used on IBM mainframes, very different from ASCII.) The Internet is based on ASCII-1968 as well.

ASCII-1968 defined the 34 control characters that remained: the C0 set, SP and DEL. Included was a short description of the intended functionality of each control character. These definitions also made their way into RFC 20 word for word. Most of them have remained materially unchanged for decades. Later standards have updated the text, but the basic functionality is still the same. That, at least, is the situation in the standards. Non-standard use is common and often contrary to the standards.

When ASCII emerged, computing equipment was quite different from the equipment on which ASCII would later become popular. Computers were regularly operated through punched cards, perforated tape and teletypewriters (TTYs). TTYs were typewriter-like devices, which were used as interactive computer terminals. Instead of a monitor they produced output on paper. The ASCII control characters were naturally designed with the devices of those days in mind. Since then, new devices such as monitors have emerged. It hasn't always been simple to adapt the control characters to the newer devices. Despite the challenges, the control characters of the 1960s are still with us.

ISO 646. ASCII evolved into an ISO standard, known as ISO 646. The first version came out in 1967. ISO 646 is the "international edition" of ASCII, with a few differences. Despite the differences, these standards were closely related. ISO 646 allowed national variants to support the national characters required for each country. The US national variant was ASCII. Several other national variants were released to support accented letters (à, ü and the like) and other symbols. The ISO variants including ASCII were a common way to express text in the 1970s and 1980s.

As to the control characters, the ASCII control characters set also appeared in ISO 646. The functionality of the control characters remained quite intact, even though the definitions were updated.

More standards. ISO 646 was also released as ECMA-6. The control characters in ECMA-6 are very similar to those of ISO 646.

Some of the C0 codes were further refined in other standards. SI, SO and ESC appeared as character set extension controls in ANSI X3.41, ISO 2022 and ECMA-35. These characters became widely used to invoke additional character sets. The Transmission Control characters (TC1 to TC10) appeared in ISO 1745 in 1975, which gave a detailed description of where and how they should be used. How widely ISO 1745 was actually used in transmission is another question.

Current status of ASCII control characters

ASCII was later updated in 1977 and again in 1986 to be in conformance with ISO 646. The control characters in ASCII-1986 and ISO 646/ECMA-6 are very similar, even though minor differences do exist.

The current ISO and ECMA versions, namely ISO 646:1991 and ECMA-6:1991, no longer define the C0 control characters. The control characters didn't go away, however. They now appear in ISO/IEC 6429:1992 and ECMA-48:1991, respectively. Simply put, the C0 set was lumped together with other control characters, the C1 set, which follows below.

As to some specific control characters, the current detailed definitions of SI, SO and ESC can be found in ANSI X3.41, ISO 2022 and ECMA-35. The current details for the Transmission Control characters (TC1 to TC10) appear in the old ISO 1745 from 1975.

Even though the history of the various standards related to the ASCII control codes may sound unnecessarily complicated, the standard functionality of the characters has not changed dramatically. It's still mostly the same as back in 1967. That, at least, is what the standards say; practice is quite different. Some control characters are indeed commonly used the standard way. On the other hand, many are used contrary to the standards, or simply ignored. It's not uncommon to find control characters forbidden in data. Control characters can have unwanted or unknown side-effects. The easiest way for programmers to deal with them is to shut their eyes to them or reject such characters altogether.

C1 control characters

C1 = positions 128–159. Primarily for displays and printers.

The C1 set appeared in the late 1970s. It is primarily designed for controlling display and printer devices, even though some of the controls warrant other uses as well. The C1 set is intended for use with the C0 set.

The C1 set includes "Format effectors" that control horizontal and vertical movement when displaying or printing. There are "Presentation controls" for defining line-break behavior. There are "Area definition" controls for form filling. There are "Introducers" and "Shift Functions" to support extra controls and characters. Additional controls exist for sending command strings and setting an indicator. Some of the controls were intended to cover for shortcomings in the C0 set. Some controls were reserved: 2 controls are for private use, while 4 controls were (and still are) reserved for future standardization.

The C1 set occupies positions 128–159 in 8-bit environments. There are also escape codes to use the C1 set on 7-bit systems. The respective escape codes (ESC char) are given in the C1 list further below.
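The relationship between the two forms is a fixed offset. In ECMA-48 and ISO 6429, the 7-bit form of a C1 control is ESC followed by the character that lies hex 40 below the 8-bit code. The following Python sketch shows the conversion in both directions; the helper names `c1_to_escape` and `escape_to_c1` are invented for illustration.

```python
ESC = 0x1B


def c1_to_escape(code: int) -> bytes:
    """Return the 7-bit two-byte form (ESC + final byte) of an 8-bit C1 code."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control code")
    return bytes([ESC, code - 0x40])   # final byte falls in hex 40-5F


def escape_to_c1(seq: bytes) -> int:
    """Return the 8-bit C1 code for a two-byte ESC sequence."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not a 7-bit C1 escape sequence")
    return seq[1] + 0x40


print(c1_to_escape(0x9B))            # CSI -> b'\x1b['
print(c1_to_escape(0x85))            # NEL -> b'\x1bE'
print(hex(escape_to_c1(b"\x1b]")))   # ESC ] -> 0x9d (OSC)
```

This is why the familiar "ESC [" prefix of terminal control sequences is equivalent to the single 8-bit code CSI (hex 9B).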

History of C1

In 1979 ANSI released additional controls for use with ASCII (ANSI X3.64). This came to be known as the C1 set. A similar set was also released as ECMA-48. According to ANSI, the C1 controls were intended for input/output control of two-dimensional character-imaging devices, including interactive terminals of both the cathode ray tube and printer types, as well as output to microfilm printers.

A bit later, in 1983, the C1 set was standardized as ISO 6429. Standard-wise, the C1 set has been volatile. Both ISO 6429 and ECMA-48 were updated several times. New control characters were added and definitions updated. One of the C1 characters (IND) was eventually deprecated and removed.

The standards actually cover more control codes than those that fit in the C1 area. These additional controls are used via control sequences (escape sequences). The sequences are beyond the subject of this article. Let it suffice that the sequences are an important part of the standards that should be used together with the C1 controls. The sequences, together with C1, are also known as VT100 and ANSI escape sequences.

Current status of C1

The current standards for C1 are ISO/IEC 6429:1992 and ECMA-48:1991. These standards now define both the C0 and C1 control characters.

Unicode allows the use of C1 (and C0 too). In fact, the C1 area has been entirely reserved for control codes in Unicode. In contrast, the (somewhat outdated) DOS and Windows codepages, i.e. character sets, have not reserved space for C1. Instead, they have included additional graphic characters in the C1 area. This doesn't prevent the use of C1 controls on DOS and Windows, though.
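The difference is easy to observe by decoding the same byte under different encodings. The sketch below uses Python's built-in codecs. The choice of byte hex 85 and of the `latin-1` and `cp1252` codecs is just an example.

```python
raw = b"\x85"                       # position 133, NEL in the C1 set

as_latin1 = raw.decode("latin-1")   # Python's latin-1 maps bytes 1:1 to U+0000-U+00FF
as_cp1252 = raw.decode("cp1252")    # Windows-1252 assigns a graphic character here

print(f"latin-1: U+{ord(as_latin1):04X}")   # U+0085, a C1 control
print(f"cp1252:  U+{ord(as_cp1252):04X}")   # U+2026, HORIZONTAL ELLIPSIS
```

Data that contains genuine C1 controls therefore needs to be decoded with an encoding that keeps the C1 area for controls.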

In practice, the C1 control characters are not very common. They are specialized codes for special applications.

ISO 8859 special characters NBSP and SHY

Positions 160 and 173.

ISO 8859 is a group of 8-bit extended character sets. The sets cover various Latin characters and also Cyrillic, Greek, Arabic, Hebrew and Thai characters. ISO 8859 is related to the Windows character sets ("ANSI codepages"), but these are actually different from each other.

Two characters in ISO 8859 are of interest to us: Non-Breaking Space (NBSP) and Soft Hyphen (SHY). They both have control character like properties, even though they are not actually called control characters in ISO 8859.

NBSP appears in position 160 (hex A0) and SHY is 173 (hex AD). The same positions, and roughly the same meanings too, have been adopted in many of the Windows codepages and in Unicode.

Note: ISO 8859-8 Latin/Hebrew defines two additional special characters, namely LRM (left-to-right mark) and RLM (right-to-left mark). These characters are not universal in ISO 8859, but specific to Hebrew. Since LRM and RLM were not used in any other ISO 8859 character set, and since they do not appear in Unicode at the same positions, they are not further presented in this article.

Current status of NBSP and SHY

Several current standards include NBSP and SHY. They appear at the same positions in all of the following:

  • ISO 8859-1 to 8859-16.
    Exception: ISO 8859-11 Latin/Thai does not include SHY.
  • Windows codepages 1250–1258.
  • Unicode, block U+0080 C1 Controls and Latin-1 Supplement.

Control characters in Unicode

Control characters have made their way to Unicode as well. Unicode recognizes control characters and explicitly allows their use. While Unicode doesn't obsolete control characters, it defines special rules for just a handful of them. Let the standard speak for itself:

The Unicode Standard provides for the intact interchange of these code points, neither adding to nor subtracting from their semantics. The semantics of the control codes are generally determined by the application with which they are used. However, in the absence of specific application uses, they may be interpreted according to the control function semantics specified in ISO/IEC 6429:1992. (Unicode 9.0 p. 822)

Unicode specifies semantics for the following control characters. The semantics appear to be in line with their original semantics, even though some differences may exist.

  1. ASCII control characters:
    • HT and SP are considered whitespace.
    • LF, VT, FF and CR are considered whitespace, and also mandatory line breaks in the line breaking algorithm.
    • FS, GS, RS and US are considered separators in the bi-directional algorithm.
  2. C1 control characters:
    • NEL is considered a mandatory line break in the line breaking algorithm, even though supporting it is optional.
  3. ISO 8859:
    • NBSP and SHY. These characters are not actually control characters in Unicode. Instead, NBSP is "Separator, space" and SHY is "Other, format". Both characters have features in the line-breaking algorithm. With SHY, Unicode is significantly more elaborate than ISO 8859 in that Unicode suggests more hyphenation features than just displaying a hyphen.
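Some of these properties can be inspected with Python's standard `unicodedata` module, as in the sketch below. The selection of sample characters is arbitrary, and the values reported depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

samples = {
    "HT": "\x09", "LF": "\x0a", "CR": "\x0d", "US": "\x1f",
    "SP": "\x20", "DEL": "\x7f", "NEL": "\x85",
    "NBSP": "\xa0", "SHY": "\xad",
}

for name, ch in samples.items():
    print(f"{name:4s} U+{ord(ch):04X}  "
          f"category={unicodedata.category(ch)}  "       # general category
          f"bidi={unicodedata.bidirectional(ch)}  "      # bidirectional class
          f"isspace={ch.isspace()}")                     # Python's whitespace test
```

The C0 and C1 controls report the general category Cc ("Other, control"), whereas NBSP reports Zs ("Separator, space") and SHY reports Cf ("Other, format").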

Note: While no new control characters appear in Unicode, it does define some of its own special characters, such as formatting characters. These characters are beyond the scope of this article.

From ASCII via ISO to Unicode

The following diagram summarizes the development of character standards. You can see how the control characters were propagated from ASCII (X3.4) and other standards to Unicode.

Diagram of standards (ASCII, ISO and Unicode) that define control characters

Control characters in modern applications

With so many control characters coming from the 1960s and 1970s, are they still useful for application programmers?

It depends on the application. Generally speaking, one needs control characters to work with old interfaces or devices. New protocols and file formats tend to use some other mechanism than control characters. Current formats typically use textual markup such as XML, which has little use for control characters beyond whitespace. On the device control side, unless you are writing device drivers, you control devices through operating system calls or library routines rather than sending them control strings to do tricks.

The following is a subjective list of which characters are still in common use and which ones are used less. The list is based on experience writing application software for Windows and DOS.

  • ASCII control characters: some used, some not
    • NUL is still common in everyday use. NUL terminates a string in many programming languages and interfaces.
    • Transmission control characters (TC1 to TC10) are generally of little use. Data transfer is done through TCP/IP sockets, HTTP, FTP or some other protocol. Individual transmission control characters appear for special uses.
    • BEL probably no longer appears in its original use. Rather than sending BEL to produce beeps, applications tend to play a sound by other means.
    • Format effectors (FE0 to FE5) are possibly the most important control characters these days. Some of them, such as CR and LF, are essential for a system to work at all. HT is also very common, especially in plain text files. BS and FF are less common. VT appears only rarely if ever.
    • Device control characters (DC1 to DC4) are no longer really needed to control devices. To control a device from an application, you make a system call instead. On the other hand, you might still need XOFF (^S) or XON (^Q) in a command line session from time to time.
    • SO, SI and ESC used to be common, but this has changed. One may find them from time to time, presumably on older systems.
    • CAN and EM are not in common use.
    • SUB might no longer appear as a substitute. You will more likely see something like "?" or the Unicode REPLACEMENT CHARACTER (U+FFFD) as a substitute for a bad character. Another use for SUB still exists, though. You could find it at the end of a text file.
    • Information separators (IS4 to IS1) are technically still valid. Whether anyone uses them to separate information is another question. Other techniques are used instead, such as XML or database systems. As a simple delimiter character, a NUL, HT, CR/LF, comma or semicolon is more common than any of the information separators originally designed for the purpose.
    • SP must be the heaviest used control character of them all.
    • DEL – well, did you ever see one?
    • Characters ^A to ^Z (1 to 26) frequently appear as keyboard shortcuts in various applications and operating systems. The actual feature triggered by a keyboard shortcut is often unrelated to the respective control character. More of that follows below.
  • C1 control characters: little use
    • NEL is the only C1 character for which Unicode specifies semantics. The most probable case to run into NEL is when EBCDIC compatibility is required.
    • The other C1 characters appear outdated now, with one caveat: since VT100 (which uses C1 extensively) is still a current method with Unix shell sessions, C1 is alive there, maybe even everyday business for you. From a programmer's point of view, though, the entire C1 set is rarely used.
  • ISO 8859 special characters: in use
    • NBSP is an everyday character to suppress a line break. It is supported by several current standards, including HTML and Unicode.
    • SHY seems to be less frequent.
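To illustrate what the information separators mentioned in the list were designed for, here is a minimal Python sketch that delimits fields with US (IS1) and records with RS (IS2). The function names `encode` and `decode` and the sample data are invented for this example; it does not follow any particular file format.

```python
US = "\x1f"   # Unit Separator (IS1), used here between fields
RS = "\x1e"   # Record Separator (IS2), used here between records


def encode(records: list[list[str]]) -> str:
    """Join fields with US and records with RS."""
    return RS.join(US.join(fields) for fields in records)


def decode(data: str) -> list[list[str]]:
    """Split a string produced by encode() back into records and fields."""
    return [record.split(US) for record in data.split(RS)]


rows = [["Smith, John", "New York"], ["O'Neil; Ann", "Line 1\nLine 2"]]
packed = encode(rows)
assert decode(packed) == rows   # commas, semicolons and newlines survive intact
print(repr(packed))
```

The appeal of this approach is that the delimiters never collide with commas, semicolons or line breaks in the data, so no quoting is needed. The drawback is that the result is hard to read and edit in ordinary text tools, which may be one reason the common delimiters listed above won out.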

Some frequently used characters, especially in a special field, may not have been mentioned. If you know frequent current uses for any of the characters, let us know.

Many of the control characters only appear rarely. How did this affect the space efficiency of 7-bit and 8-bit character sets? Instead of reserving space for control characters, it was possible to reuse these areas for additional graphics. This was actually done by DOS, Windows and Mac, all of which assigned graphic characters to the control character areas. Unicode chose to be different in this respect. Since its code space is much larger than 128 or 256, it was possible to reserve the C0/C1 areas entirely for control characters. This has helped the control characters to survive, if not in practical use, then at least in various code charts and lists.

Keyboards and control characters

Users can create many of the control characters from their keyboards. This usually happens in combination with the Ctrl key, and, more rarely, with the Esc key. There are also some special keys that produce control characters on their own. Backspace, Enter, Esc, Space and Tab are the usual ones.

Key presses and control characters, while having some things in common, are usually unrelated. Pressing a key combination doesn't generally trigger the functionality of the respective control character. As an example, while it's possible to press Ctrl+O to create an SI (Shift In, hex 0F), pressing Ctrl+O seldom runs the operation associated with SI (return to the standard character set). Instead, Ctrl+O might start an operation beginning with an O, such as "Open".

In some cases a key press does trigger the respective control character feature. Pressing the Tab key, or Ctrl+I, can indeed produce an HT (Horizontal Tabulation) and move the cursor forward on the line. This is an exception rather than the norm, though.

Some key combinations are more likely than others. Ctrl+A through Ctrl+Z (in other words, ^A to ^Z) are common keyboard shortcuts. Control key combinations with a symbol (^@, ^[, ^\, ^], ^^, ^_, ^?) are less common. There is a reason why such combinations should be avoided. Considerable variation exists with symbol keys in different keyboard layouts. A Ctrl and symbol key combination doesn't always produce the same control character, or any character at all, which makes it less useful as a keyboard shortcut.
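The caret notation used above follows a simple rule: the control code is the code of the key character with one bit flipped (hex 40). The Python sketch below shows the mapping in both directions. The function names are invented for illustration, and the sketch describes the notation only, not what any particular keyboard layout or terminal actually sends.

```python
def caret_to_code(notation: str) -> int:
    """Map caret notation such as '^A' or '^[' to its control code."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError("expected a caret followed by one character")
    key = ord(notation[1])
    if 0x61 <= key <= 0x7A:          # treat a-z like A-Z
        key -= 0x20
    if not 0x3F <= key <= 0x5F:
        raise ValueError("no control character for this key")
    return key ^ 0x40                # '@'..'_' -> 0..31, '?' -> 127


def code_to_caret(code: int) -> str:
    """Map a control code (0-31 or 127) back to caret notation."""
    if not (0 <= code <= 0x1F or code == 0x7F):
        raise ValueError("not an ASCII control code")
    return "^" + chr(code ^ 0x40)


for notation in ("^@", "^A", "^I", "^Z", "^[", "^?"):
    print(notation, caret_to_code(notation))
```

For example, ^I maps to 9 (HT), ^[ maps to 27 (ESC) and ^? maps to 127 (DEL).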

In this article the focus is on the programmatic features of control characters. Less focus is put on the use of keyboard shortcuts.

About the character list

Next we are going to list every control character in detail. The column Dec refers to the decimal value of the control code ("ASCII value"). Hex is the same in hexadecimal, preceded by a dollar sign for clarity. An octal value is also given. The column Pos shows the position of the character in code charts in column/row notation (for example, 0/10 is column 0, row 10).
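All four numeric columns are different renderings of the same code value. As a rough Python sketch, assuming that Pos is the high four bits followed by the low four bits of the code (as in 0/10 for LF), they can be derived as follows. The function name `chart_columns` is hypothetical.

```python
def chart_columns(code: int) -> dict[str, str]:
    """Format one code the way the list's Dec, Hex, Octal and Pos columns do."""
    return {
        "Dec": str(code),
        "Hex": f"${code:02X}",                 # dollar sign marks hexadecimal
        "Octal": f"{code:03o}",                # three octal digits
        "Pos": f"{code >> 4}/{code & 0x0F}",   # high nibble / low nibble
    }


print(chart_columns(0x0A))   # LF
print(chart_columns(0x08))   # BS
```

For LF this yields 10, $0A, 012 and 0/10.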

The list shows key presses that (often) produce the control character on the keyboard. In addition, C-style escape sequences (\c) are provided where available, as are special constants supported by Visual Basic: classic version and Visual Basic .NET.

The last column lists mnemonics and graphic symbols. The symbols (in black) have been standardized, but they have fallen into disuse. The 2-letter mnemonics are standardized for the ASCII section. Additional 2-letter mnemonics for the C1 and ISO 8859 sections are taken from RFC 1345, which is not a standard, but is frequently referred to in this context.

Character list

ASCII control characters (C0)

The ASCII control characters work in 7-bit and 8-bit environments, as well as in Unicode. These controls originate from a set of related standards: ASCII, ISO 646 and ECMA-6, and also ISO 6429 and ECMA-48. All of these characters are available in Unicode, too. The actual C0 set consists of characters NUL through US (0–31). Two additional characters, SP and DEL, are a part of ASCII and the related standards as well.

*) The 2-character mnemonics for the ASCII set are from ANSI X3.32, ISO 2047 and ECMA-17. So are also the graphic symbols. The symbols are outdated and rarely used. A couple of the symbols also have alternative forms.

Dec Hex Char Description Octal Pos *)
0 $00 NUL Null 000 0/0 NU
\0 ^@ NUL is defined in the standards as a filler character. It can be used as media-fill or time-fill. NUL doesn't affect the information content of a data stream. It may affect the information layout and the control of equipment, though. NUL
Note: NUL was originally intended as an ignorable filler character with no meaning. Especially convenient on paper tape, where a NUL equals no holes punched, it could be used to reserve space for new information or correcting errors. ASCII-1986 even suggests NUL as a "time-waster" character to be added after a newline to accommodate mechanical devices where a carriage return works slowly. Despite this, NUL has been used contrary to the standards in null-terminated strings as an End-Of-String marker. Several programming languages use this convention.
Constant in Visual Basic and VB.NET: vbNullChar, NullChar
1 $01 SOH Start of Heading — TC1 Transmission control character 1 001 0/1 SH
^A Indicates the beginning of a heading in a transmission. The heading can be terminated by STX. As per ASCII-1968, a heading constitutes a machine-sensible address or routing information. Later standards have dropped the explanation. SOH
Note: SOH, along with STX and ETX, was intended for data transmission. It is not intended for marking a heading in a document.
2 $02 STX Start of Text — TC2 Transmission control character 2 002 0/2 SX
^B STX has two functions in a transmission: it 1) indicates the beginning of a text and 2) may terminate a heading (see SOH). As per ASCII-1968, text is what should be transmitted to a destination. Later standards have dropped the explanation. STX
3 $03 ETX End of Text — TC3 Transmission control character 3 003 0/3 EX
^C Terminates a text in a transmission. As per ASCII-1968, a text starts with STX and ends with ETX. Later standards don't necessarily require the pairing of STX with ETX. ETX
Note: ETX may be used to call for reply from a slave station after a message has been sent. ETX is also commonly used to terminate an interactive process (Ctrl+C).
Ctrl+Break on PC keyboard produces this character code.
4 $04 EOT End of Transmission — TC4 Transmission control character 4 004 0/4 ET
^D Indicates the conclusion of a transmission. The transmission may have contained one or more texts and associated heading(s). EOT
Note: EOT can be used to end or abort a transmission. It can also be a reply to indicate inability to receive further messages. EOT (Ctrl+D) is even used as an End-Of-File control in a Unix shell session.
5 $05 ENQ Enquiry — TC5 Transmission control character 5 005 0/5 EQ
^E Requests a response from a remote station. The response may include station identification or status. ENQ can be used as a "Who Are You" (WRU) to identify a remote station, especially after a new connection has been established. ENQ
6 $06 ACK Acknowledge — TC6 Transmission control character 6 006 0/6 AK
^F An affirmative response. Transmitted from a receiver as a response to the sender. ACK
Note: ACK can indicate that a slave station has received a message correctly and is ready to receive more.
7 $07 BEL Bell 007 0/7 BL
\a ^G Calls for human attention. BEL may control alarm or attention devices. BEL
Note: BEL is the only control character with an audible effect. It has been used to ring a bell (indeed) or produce a beep sound. A visual alarm is also possible.
In Unicode, this control character is abbreviated BEL but named ALERT, while the name BELL is confusingly used for a graphic character (🔔).
8 $08 BS Backspace — FE0 Format effector 0 010 0/8 BS
\b ^H Moves one character position backwards (keeping the previous character). BS
Note: Contrary to the standards, BS has been used as a combined "move back and delete" operation to remove the previous character. This is not the standard meaning of BS, however. BS is defined as a non-destructive "move back" or "move left" operation, similar to a backspace in mechanical typewriters. To delete the previous character, BS should be followed by DEL. On paper tape the result would be the previous character being completely punched out (erased). BS followed by another character would strike two characters in the same position. Overstriking was a way to produce combined characters. This option was intended to internationalize ASCII. A letter followed by BS followed by a diacritic symbol would produce an accented letter. As an example, u BS ^ would produce û. Several ASCII characters (" ' ` ^ ~ ,) were indeed defined to be used as diacritic symbols. Overstriking could also be suitable with other characters, such as for underlining with the "_" character or printing a slash "/" over "=" to produce "not equal". It could even be used to achieve a strike-through effect (perhaps with -, / or X) to indicate removed text. A boldface effect could be achieved by striking the same character several times at the same position.
Overstriking was a useful option with printing devices, but displays hardly support it. With the advent of more capable character sets and formatting techniques overstriking can be considered outdated. ASCII-1986 does not require overstriking capabilities and suggests that overstriking may be proscribed in the future. ISO 8859 explicitly forbids overstriking.
←Backspace on PC keyboard produces this character code.
Constant in Visual Basic and VB.NET: vbBack, Back
See also: CCH
9 $09 HT Horizontal Tabulation — FE1 Format effector 1 (Character Tabulation) 011 0/9 HT
\t ^I Advances to the next pre-determined character position (horizontal tab stop). HT could also be used as a skip function on punched cards. HT
Note: HT is commonly also abbreviated TAB.
Even though the standards don't set a universal tab width, a typical fixed tab width is 8 columns. Other tab widths, as well as custom tab positions, are used as well. HT is a simple method of data compression: a single character can represent several spaces in formatted text.
The TAB key on the keyboard is consistent with HT in that it usually produces the code HT. How the HT is treated in each application is another story. In windowing environments, there are three common alternative uses. Pressing TAB can either add an HT character into text, indent text (possibly by adding an appropriate number of spaces or shifting the margin), or do something completely different: jump to the next field or control in a graphical user interface. This way the TAB key has been extended to cover more uses than what HT was originally intended for.
The original name of HT is Horizontal Tabulation. It was later renamed as HT Character Tabulation, first in ECMA-48:1986.
↹ Tab on PC keyboard produces this character code.
Constant in Visual Basic and VB.NET: vbTab, Tab
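As a minimal sketch of the typical fixed tab width, the following Python function expands each HT to the next tab stop. The function name expand_tabs and the default width of 8 are assumptions for illustration, since the standards do not fix a tab width.

```python
def expand_tabs(line, width=8):
    """Replace each HT with spaces up to the next multiple of `width`."""
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            # Distance from the current column to the next tab stop.
            pad = width - (col % width)
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)
```

For single-line input this should agree with Python's built-in str.expandtabs, which implements the same fixed-width rule.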
10 $0A LF Line Feed — FE2 Format effector 2 012 0/10 LF
^J LF has two alternative functions. It advances to the same character position on the next line (move down), or optionally to the first position on the next line (move to start of next line, i.e. newline). Originally LF was a move-down. A newline option (NL) was added soon after. The option allowed LF to be used as a newline, which works like a combined CR LF. Use of LF as a newline requires agreement between sender and recipient of data. Universal agreement has not been reached. LF
Note: LF, having two alternative functions, has been a major source of confusion. While LF was initially defined as a "move down" operator, standards began to allow LF as a newline too. As a result, operating systems differ in their definition of a newline. A newline is LF on Unix. Operating systems using CR LF include CP/M, DOS, OS/2 and Windows. Naturally, this caused an incompatibility. To solve the problem, control characters IND and NEL were added to the C1 area. This did not solve the issue, resulting in IND being removed later. ECMA-6:1985 and ASCII-1986 attempted to clarify the situation by declaring LF deprecated for a newline and recommending CR LF instead. ECMA-48:1991 no longer allows LF to function as a newline.
The escape sequence for newline and LF is another source of confusion. \n is the common sequence for a newline, whereas there is no such sequence for a line feed. The actual control character(s) represented by \n depend on the system. In some cases, \n indeed represents LF, but it can also represent another newline sequence.
Ctrl+Enter on PC keyboard produces this character code.
Constant in Visual Basic and VB.NET: vbLf, Lf
See also: CR IND NEL
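Because systems disagree on the newline marker, programs that exchange text often normalize it. The following is a minimal Python sketch, assuming the input is already decoded text and that CR LF, CR, LF and NEL should all be treated as one line break; the function name normalize_newlines is hypothetical.

```python
import re

# CR LF must come first so that it is matched as one line break, not two.
_LINE_BREAK = re.compile("\r\n|\r|\n|\x85")

def normalize_newlines(text, newline="\n"):
    """Replace every recognized line break with `newline`."""
    return _LINE_BREAK.sub(lambda match: newline, text)
```

Passing newline="\r\n" converts the same input to the CR LF convention instead.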
11 $0B VT Vertical Tabulation — FE3 Format effector 3 (Line Tabulation) 013 0/11 VT
\v ^K Advances to the same character position on the next pre-determined line. ASCII-1977 and ASCII-1986 optionally allow VT to advance to the first position on the next pre-determined line, if agreed on. VT
Note: The original name of VT is Vertical Tabulation. It was later renamed as VT Line Tabulation, first in ECMA-48:1986. VT has been used to jump down to the next pre-defined line when printing on a paper form. According to some sources, vertical tab stops were typically spaced 6 lines apart. VT is a simple data compression method where a single VT represents several LF characters (and optionally a CR too).
In modern use VT must be quite a rare character. As Bob Bemer, one of the original designers of ASCII, put it: "This is a very dangerous character to use. It cannot be used directly on any terminal that I know of. Even if it could, the implementation rules are not supplied unambiguously in the ASCII standard."
Constant in Visual Basic and VB.NET: vbVerticalTab, VerticalTab
12 $0C FF Form Feed — FE4 Format effector 4 014 0/12 FF
\f ^L Advances to the next form or page. Standards differ in what column the subsequent character position will be in. Originally, ASCII-1968 did not define the column at all. ISO and ECMA standards declare that FF does not change the column. ASCII-1977 and ASCII-1986 optionally allow, by agreement, moving to the first column, as if FF was actually CR FF. FF
Note: FF has been used as "page break" in text files, "new page" on printers and "clear the screen" on displays. The situation was originally unclear whether FF was just a "new page" operator or "new page, move to column 1". ASCII-1977 and ECMA-6:1985 attempted to clarify the situation by recommending the use of CR FF. ASCII-1986 even implied that the "new page, move to column 1" option might be deleted in a future edition of ASCII.
Constant in Visual Basic and VB.NET: vbFormFeed, FormFeed
13 $0D CR Carriage Return — FE5 Format effector 5 015 0/13 CR
\r ^M Traditional definition: Moves to the first position on the same line (ASCII, ISO 646, ECMA-6). Newer definition: Moves to the line home position or line limit position of the same line (ISO 6429, ECMA-48). CR
Note: The standard meaning of CR is "move to beginning of current line". This allows overprinting the line with new characters, which could be used to achieve underlining, for example. For advancing to the next line CR would be followed by LF. On CP/M, DOS, OS/2 and Windows the newline marker is CR LF, which is in line with the definition. CR alone has been used as the newline character on some systems, such as Commodore and Apple; this use does not conform to the standards in question. The order CR LF (instead of LF CR) may have been important on mechanical devices where a carriage return took relatively long to execute. A non-printing LF was more suitable output while the printing head was returning, rather than striking a graphic symbol in the middle of the line.
Enter on PC keyboard produces this character code.
Constant in Visual Basic and VB.NET: vbCr, Cr
See also: LF
14 $0E SO Shift Out — LS1 Locking-Shift One 016 0/14 SO
^N Used to extend the character set. SO may alter the meaning of the following bit combinations until an SI is reached. Between SO and SI, character positions 33-126 (decimal) may represent additional characters that would not otherwise fit in the regular character set. SO
Note: SO (Shift Out) is the normal name of this control. LS1 (Locking-Shift One) is used by ECMA-35 and ECMA-48. In those standards, SO is used in 7-bit environments and LS1 in 8-bit environments. The mechanism to select the alternative character set(s) was defined in ANSI X3.41, ISO 2022 and ECMA-35. It includes the use of escape sequences starting with ESC. SO has also been used on printers to select enlarged characters or another color.
15 $0F SI Shift In — LS0 Locking-Shift Zero 017 0/15 SI
^O Used in conjunction with SO. It may reinstate the standard meanings of the characters following it. SI
Note: SI (Shift In) is the normal name of this control. LS0 (Locking-Shift Zero) is used by ECMA-35 and ECMA-48. In those standards, SI is used in 7-bit environments and LS0 in 8-bit environments. SI has also been used on printers to select condensed characters or to reset color.
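The locking behavior of SO and SI can be sketched as a small decoder. This is an illustration under stated assumptions: the two character sets are passed in as plain dictionaries, the alternative set in the usage example is made up, and the escape sequences that would normally designate the sets are left out.

```python
SO, SI = 0x0E, 0x0F

def decode_shifted(data, g0, g1):
    """Decode 7-bit bytes, switching the active set on SO and SI.

    g0 and g1 map byte values 0x21-0x7E to strings; bytes without an
    entry fall back to their plain ASCII meaning (an assumption)."""
    active = g0
    out = []
    for byte in data:
        if byte == SO:
            active = g1          # locking shift: stays until SI
        elif byte == SI:
            active = g0          # back to the standard meanings
        elif 0x21 <= byte <= 0x7E:
            out.append(active.get(byte, chr(byte)))
        else:
            out.append(chr(byte))
    return "".join(out)
```

For example, with a hypothetical alternative set {0x61: "α", 0x62: "β"}, the bytes a b SO a b SI a b decode to "abαβab".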
16 $10 DLE Data Link Escape — TC7 Transmission control character 7 020 1/0 DL
^P Used to provide supplementary data transmission control functions. DLE changes the meaning of a limited number of following characters. DLE
Note: DLE is the "escape" character for transmission control. DLE can potentially be put in the front of a transmission control character (TC1-TC10) to pass it through "as is" instead of controlling the current transmission. This is not always the case, though. It is possible to create new transmission control sequences with DLE in a similar way ESC is used to create escape sequences for other purposes. Contrary to the standards, Ctrl+P has been used as a keyboard command to echo console activity at the printer.
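One way DLE has been used to pass control characters through "as is" is byte stuffing. The sketch below is an assumption for illustration and is not prescribed by the character standards: a block is opened with DLE STX, closed with DLE ETX, and every DLE inside the payload is doubled. The function names frame and unframe are hypothetical.

```python
DLE, STX, ETX = 0x10, 0x02, 0x03

def frame(payload):
    """Wrap payload in DLE STX ... DLE ETX, doubling each DLE inside."""
    body = payload.replace(bytes([DLE]), bytes([DLE, DLE]))
    return bytes([DLE, STX]) + body + bytes([DLE, ETX])

def unframe(data):
    """Reverse frame() for one complete, already delimited block."""
    if data[:2] != bytes([DLE, STX]) or data[-2:] != bytes([DLE, ETX]):
        raise ValueError("missing DLE STX / DLE ETX delimiters")
    body = data[2:-2]
    # Inside the body every DLE occurs as a pair, so pairs collapse to one.
    return body.replace(bytes([DLE, DLE]), bytes([DLE]))
```

A receiver that reads from a continuous stream would have to scan for an undoubled DLE ETX to find the end of the block; this sketch assumes the block boundaries are already known.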
17 $11 DC1 Device Control 1 — XON 021 1/1 D1
^Q Intended to turn on or start an ancillary device, to restore it to the basic operation mode (see DC2 and DC3), or for any other device control function. DC1
Note: DC1 is conventionally called XON when used in communication for software flow control. The meaning of XON is to continue data transmission after an XOFF (DC3) has been received. The name XON ("transmit on") does not come from a standard, but it is commonly used.
18 $12 DC2 Device Control 2 022 1/2 D2
^R Intended for turning on or starting an ancillary device, setting it to a special mode (restored via DC1), or for any other device control function. DC2
19 $13 DC3 Device Control 3 — XOFF 023 1/3 D3
^S Intended for turning off or stopping an ancillary device. It may be a secondary level stop such as wait, pause, stand-by or halt (restored via DC1). It can also perform any other device control function. DC3
Note: DC3 is conventionally called XOFF when used in communication for software flow control. An XOFF is issued to stop transmission when a device cannot accept more data. Transmission can be continued via XON (DC1). The name XOFF ("transmit off") does not come from a standard, but it is commonly used. The use of XOFF and XON is in line with the standards, even though not directly defined in them.
XOFF (^S) is sometimes used as a pause command. Continuing requires pressing XON (^Q). ^S even works as a pause on MS-DOS (pressing any key continues).
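Software flow control with XOFF and XON can be sketched as a tiny simulation. Everything here is a simplified assumption: time is modeled as numbered ticks, the sender emits one byte per tick, and the events dictionary stands in for control characters arriving from the receiver.

```python
XON, XOFF = 0x11, 0x13   # DC1 and DC3

def send_with_flow_control(data, events, max_ticks=1000):
    """Return what the sender emits on each tick (None while paused).

    events maps a tick number to a control byte received on that tick."""
    paused = False
    sent = []
    index = 0
    for tick in range(max_ticks):
        control = events.get(tick)
        if control == XOFF:
            paused = True        # receiver cannot accept more data
        elif control == XON:
            paused = False       # receiver is ready again
        if index >= len(data):
            break
        if paused:
            sent.append(None)
        else:
            sent.append(data[index])
            index += 1
    return sent
```

Sending b"abc" with an XOFF at tick 1 and an XON at tick 3 produces one byte, two idle ticks, and then the remaining two bytes.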
20 $14 DC4 Device Control 4 (Stop) 024 1/4 D4
^T Intended to turn off, stop or interrupt an ancillary device, or for any other device control function. DC4
21 $15 NAK Negative Acknowledge — TC8 Transmission control character 8 025 1/5 NK
^U Negative response. Transmitted from a receiver as a response to the sender. NAK
Note: NAK can be sent as a response to indicate inability to receive a message, or to request resending.
22 $16 SYN Synchronous Idle — TC9 Transmission control character 9 026 1/6 SY
^V Used as "time-fill" in synchronous transmission. Sent during an idle condition to retain a signal when there are no other characters to send. SYN
Note: SYN has been used by synchronous modems, which have to send data constantly. Beginning each transmission with at least two SYN characters is a way to achieve synchronization. The receiving station may ignore SYN, since it doesn't belong to the actual data content.
23 $17 ETB End of Transmission Block — TC10 Transmission control character 10 027 1/7 EB
^W Indicates the end of a block of data. Used when data is divided into blocks for transmission. ETB
Note: ETB, when used to end a block, may call for a reply from a slave station.
24 $18 CAN Cancel 030 1/8 CN
^X Indicates that data is in error or should be disregarded. Affects "the data with which it is sent" (ASCII-1968, ASCII-1977) or "the data preceding it" (ASCII-1986, ISO 646, ECMA-6, ECMA-48). CAN
Note: There are two alternative definitions for the data to be disregarded. The actual scope of cancellation is undefined by the standards and should be defined case by case. Ctrl+X has been used as a keyboard shortcut to cancel (delete) the characters on the current line, a use that conforms to the standards.
25 $19 EM End of Medium 031 1/9 EM
^Y Identifies 1) the physical end of a medium, 2) the end of the used portion of a medium, or 3) the end of wanted data on a medium. EM
Note: EM may have been suitable for paper tape or magnetic tape to say "no more data". Disk file systems use more sophisticated ways to keep track of the used and unused areas of the medium.
This character is commonly abbreviated EM, except for Unicode, which provides it as an alias with abbreviation EOM.
26 $1A SUB Substitute 032 1/10 SB
^Z Used in place of an invalid or erroneous character. Introduced by automatic means in cases like a transmission error. SUB
Note: When SUB is used as a substitution character, the reverse question mark symbol seems quite good as its visual representation. Compare SUB to Unicode U+FFFD REPLACEMENT CHARACTER.
SUB has often been used contrary to the standards. On CP/M and MS-DOS, it appears as an End-Of-File marker for text files (^Z). On Unix, Ctrl+Z is a keyboard signal to suspend a foreground process.
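When reading old text files that follow the CP/M and MS-DOS convention, a program may want to stop at the first ^Z. A minimal Python sketch, with the hypothetical function name strip_ctrl_z_eof and the assumption that everything after the first SUB is padding:

```python
SUB = b"\x1a"

def strip_ctrl_z_eof(raw):
    """Return the bytes before the first SUB (^Z), or all of them
    when no SUB is present."""
    head, _separator, _tail = raw.partition(SUB)
    return head
```

This should only be applied to files known to use the convention, since SUB may legitimately appear in other data.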
27 $1B ESC Escape 033 1/11 EC
\e ^[ The first character of an escape sequence. Provides either supplementary characters or additional control functions. ESC changes the meaning of a limited number of following characters. ESC
Note: ESC is used to form escape sequences, which perform various control functions or apply additional character sets. ESC can also be used to invoke the C1 control characters on a 7-bit system that only supports character positions 0–127.
On the keyboard, sometimes the Esc key indeed produces the ESC control character. In windowing environments, the Esc key typically cancels a dialog or an operation, rather than producing a control character or starting an escape sequence. This kind of an "escape" is not based on the character standards, however. The closest ASCII equivalent for canceling a dialog would be CAN, but since there is no "Can" key on the common keyboards, it can't be used.
Esc on PC keyboard produces this character code.
28 $1C FS File Separator — IS4 Information separator 4 034 1/12 FS
^\ The four information separators (FS, GS, RS and US) are used to separate and qualify data. Each separator has two alternative names: Information Separator Four equals File Separator, Information Separator Three equals Group Separator, Information Separator Two equals Record Separator and Information Separator One equals Unit Separator. The separators can be used either hierarchically or in a non-hierarchical manner. When used hierarchically, the order is US (least inclusive), RS, GS and FS (most inclusive). The content and length of a file, group, record or unit are not specified by the standards. FS
FS, when used in a hierarchical order, delimits a data item called a file. It can also delimit anything else.
29 $1D GS Group Separator — IS3 Information separator 3 035 1/13 GS
^] GS, when used in a hierarchical order, delimits a data item called a group. It can also delimit anything else. GS
30 $1E RS Record Separator — IS2 Information separator 2 036 1/14 RS
^^ RS, when used in a hierarchical order, delimits a data item called a record. It can also delimit anything else. RS
31 $1F US Unit Separator — IS1 Information separator 1 037 1/15 US
^_ US, when used in a hierarchical order, delimits a data item called a unit. It can also delimit anything else. US
Note: The information separators were deliberately arranged next to SPACE, which can also be used as an information separator (word separator).
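The hierarchical use of the four separators can be sketched in Python. The nesting below (files containing groups containing records containing units) follows the order given above; the function name parse_hierarchy is hypothetical, and since the standards do not specify the content of each level, the data in the usage example is arbitrary.

```python
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"

def parse_hierarchy(text):
    """Split text into files > groups > records > units."""
    return [
        [
            [record.split(US) for record in group.split(RS)]
            for group in file_.split(GS)
        ]
        for file_ in text.split(FS)
    ]
```

For example, the string a US b RS c GS d FS e parses into two files: the first has two groups (one with records ["a", "b"] and ["c"], one with the record ["d"]) and the second has a single unit "e".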
32 $20 SP Space 040 2/0 SP
Moves one character position forwards. Space may also have a function equivalent to that of an information separator. SP
Note: Space has a dual nature. It can be classified as both a control character and a (non-printing) graphic character. SP is similar to a Format Effector. It can also be used as a fifth Information Separator. Space is sometimes represented by the symbol ƀ or ␢ (b with a stroke) or ␣ (open box). SP does not belong to the C0 set.
Spacebar on PC keyboard produces this character code.
See also: NBSP
127 $7F DEL Delete 177 7/15 DT
^? Outdated. An ignorable character originally intended for erasing an erroneous or unwanted character in punched tape. In this standard use, DEL wouldn't affect the information content of data, even though it may have affected the information layout and the control of equipment. Standards also allowed DEL to be used as media-fill or time-fill (even though a NUL may be more appropriate). DEL
Note: DEL is now outdated. It was removed from the latest standards (ECMA-48 in 1991 and ISO 6429 in 1992). The origin of DEL is with perforated paper. On that, DEL was equal to "all holes punched", which is a way to invalidate an erroneous character (rubout). In a sense, DEL is similar to NUL, since both characters mean "nothing". ASCII-1977 suggests the use of DEL as a "time waster" to accommodate mechanical devices where a carriage return takes time to execute. ASCII-1986 recommends NUL as a time waster instead of DEL. DEL does not belong to the C0 set, but is an individual control code.
Ctrl+←Backspace on PC keyboard produces this character code.
See also: NUL

\x is what you write in a C program to produce the given control character. ^X means you press Ctrl+X to produce the given control character.
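The ^X notation in the table follows a simple rule: the letter or symbol after the caret is the control code with bit 6 (value 0x40) flipped. A short Python sketch, with hypothetical function names:

```python
def caret(code):
    """Return the caret notation for a C0 control code or DEL."""
    if not (0x00 <= code <= 0x1F or code == 0x7F):
        raise ValueError("not a C0 control character or DEL")
    return "^" + chr(code ^ 0x40)

def from_caret(notation):
    """Return the control code for a notation such as '^I'."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError("expected a caret followed by one character")
    return ord(notation[1]) ^ 0x40
```

The same rule explains why DEL is written ^?: the code of "?" is 0x3F, and flipping 0x40 gives 0x7F.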

C1 control characters

The C1 control characters work in 8-bit environments. These controls come from 3 related standards: ANSI X3.64, ISO 6429 and ECMA-48. All of these characters are also available in Unicode. There are three unassigned control characters: PAD, HOP and SGCI. Use was planned for them in a failed draft DIS 10646, but they were not actually standardized or put to use. Despite this, one can find these control characters in various C1 lists online, and also as aliases in later Unicode standards.

†) The 2-character mnemonics for C1 are from RFC 1345. They are not standardized.

Dec Hex Char Description Octal Pos †)
128 $80 PAD unassigned, "Padding Character" 200 8/0 PA
ESC @ A reserved control code. Intended for use as PAD Padding Character in draft DIS 10646, rejected, never standardized (not accepted to ISO 10646).
Note: Not part of ISO/IEC 6429 or ECMA-48.
Unicode lists this character as XXX and provides PAD as an alias.
129 $81 HOP unassigned, "High Octet Preset" 201 8/1 HO
ESC A A reserved control code. Intended for use as HOP High Octet Preset in draft DIS 10646, rejected, never standardized (not accepted to ISO 10646).
Note: Not part of ISO/IEC 6429 or ECMA-48.
Unicode lists this character as XXX and provides HOP as an alias.
130 $82 BPH Break Permitted Here 202 8/2 BH
ESC B A point where a line break may occur.
Note: Roughly equivalent to a soft hyphen except that the means for indicating a line break is not necessarily a hyphen. Compare to Unicode U+200B ZERO WIDTH SPACE.
131 $83 NBH No Break Here 203 8/3 NH
ESC C A point where a line break may not occur.
Note: Compare to Unicode U+2060 WORD JOINER.
132 $84 IND Index 204 8/4 IN
ESC D Moves to the next line keeping the current horizontal position.
Note: According to ECMA-48:1986, IND was provided for use in those cases where LF was implemented as New Line. IND was deprecated in 1988 and withdrawn in 1992 from ISO/IEC 6429 (1986 and 1991 respectively for ECMA-48).
See also: LF RI
133 $85 NEL Next Line 205 8/5 NL
ESC E Moves to the first position of the next line. Alternatively, to line home or line limit position.
Note: NEL maps to the control character NL (New Line) in the EBCDIC character set used on IBM mainframes.
See also: LF
134 $86 SSA Start of Selected Area 206 8/6 SA
ESC F Starts a string of character positions whose contents can be transmitted. The string ends at ESA (or end of display).
135 $87 ESA End of Selected Area 207 8/7 ES
ESC G Ends a string of character positions (started by SSA) whose contents can be transmitted.
136 $88 HTS Horizontal Tabulation Set, Character Tabulation Set 210 8/8 HS
ESC H Sets a tab stop at the active position.
Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed HTS as Character Tabulation Set.
137 $89 HTJ Horizontal Tabulation with Justification, Character Tabulation with Justification 211 8/9 HJ
ESC I Moves text to the following tab stop. The text is what comes after the previous tab stop up to the active position.
Note: This character has several names. ANSI X3.64 originally called it Horizontal Tabulation with Justify. ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed HTJ as Character Tabulation with Justification.
138 $8A VTS Vertical Tabulation Set, Line Tabulation Set 212 8/10 VS
ESC J Sets a vertical tab stop at the active line.
Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed VTS as Line Tabulation Set.
139 $8B PLD Partial Line Down, Partial Line Forward 213 8/11 PD
ESC K Moves down so that following characters will appear as subscripts. Subscripts end at the next PLU.
Note: ISO 6429:1992 and ECMA-48:1991 have renamed PLD as Partial Line Forward. Sample: text PLD subscript PLU text.
140 $8C PLU Partial Line Up, Partial Line Backward 214 8/12 PU
ESC L Moves up so that following characters will appear as superscripts. Superscripts end at the next PLD.
Note: ISO 6429:1992 and ECMA-48:1991 have renamed PLU as Partial Line Backward. Sample: text PLU superscript PLD text.
141 $8D RI Reverse Index, Reverse Line Feed 215 8/13 RI
ESC M Moves to the previous line keeping the current horizontal position.
Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 renamed RI as Reverse Line Feed, apparently related to the removal of IND.
See also: IND
142 $8E SS2 Single Shift Two 216 8/14 S2
ESC N Used to extend the character set. The next character will be from the currently chosen G2 set.
Note: For more information see ISO 2022 or ECMA-35. The next character should be in the decimal range 33-126 or 32-127.
143 $8F SS3 Single Shift Three 217 8/15 S3
ESC O Used to extend the character set. The next character will be from the currently chosen G3 set.
Note: For more information see ISO 2022 or ECMA-35. The next character should be in the decimal range 33-126 or 32-127.
144 $90 DCS Device Control String 220 9/0 DC
ESC P Starts a device control string. ST ends the string. The control string may include commands to the receiving device, or a status report from the sending device.
145 $91 PU1 Private Use One 221 9/1 P1
ESC Q Reserved for private use, no standardized meaning.
146 $92 PU2 Private Use Two 222 9/2 P2
ESC R Reserved for private use, no standardized meaning.
147 $93 STS Set Transmit State 223 9/3 TS
ESC S Notifies that data is ready for transfer from a device (ANSI X3.64), or establishes the transmit state in the receiving device (ISO 6429, ECMA-48). Doesn't initiate the actual transmission.
148 $94 CCH Cancel Character 224 9/4 CC
ESC T Ignore the preceding graphic character (and CCH itself too). If the previous character is a control character or sequence, ANSI X3.64 says it should be ignored, while ISO 6429 and ECMA-48 leave the action undefined.
Note: Destructive backspace. Intended to eliminate ambiguity about the meaning of BS.
See also: BS
149 $95 MW Message Waiting 225 9/5 MW
ESC U Sets a message waiting indicator in the receiving device.
150 $96 SPA Start of Guarded Protected Area, Start of Protected Area, Start of Guarded Area 226 9/6 SG
ESC V Starts a string of character positions that can't be altered manually or transmitted. Optionally protects against erasure too. EPA will end the string.
Note: SPA is known as Start of Protected Area (ANSI X3.64, ECMA-48:1979), Start of Guarded Protected Area (ISO 6429:1983, ECMA-48:1984) and Start of Guarded Area (ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991).
151 $97 EPA End of Guarded Protected Area, End of Protected Area, End of Guarded Area 227 9/7 EG
ESC W Ends the area started by SPA.
Note: EPA is known as End of Protected Area (ANSI X3.64, ECMA-48:1979), End of Guarded Protected Area (ISO 6429:1983, ECMA-48:1984) and End of Guarded Area (ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991).
152 $98 SOS Start of String 230 9/8 SS
ESC X Starts a control string. The string ends at ST. It cannot contain a SOS. The interpretation of the string depends on the application.
153 $99 SGCI unassigned, "Single Graphic Character Introducer" 231 9/9 GC
ESC Y A reserved control code. Intended for use as SGCI Single Graphic Character Introducer in draft DIS 10646, rejected, never standardized (not accepted to ISO 10646).
Note: Not part of ISO/IEC 6429 or ECMA-48.
Unicode lists this character as XXX and provides SGC as an alias.
154 $9A SCI Single Character Introducer 232 9/10 SC
ESC Z A reserved control code. The name was standardized as SCI Single Character Introducer, but the actual functionality was not implemented in the standards.
Note: SCI was to be followed by a single byte, which would represent a control function or a graphic character. The functions or characters were not defined in the standards.
155 $9B CSI Control Sequence Introducer 233 9/11 CI
ESC [ Starts a control sequence.
156 $9C ST String Terminator 234 9/12 ST
ESC \ Closes a string opened by APC, DCS, OSC, PM or SOS.
157 $9D OSC Operating System Command 235 9/13 OC
ESC ] Starts an operating system control string. The string ends at ST and is interpreted subject to the operating system.
158 $9E PM Privacy Message 236 9/14 PM
ESC ^ Starts a privacy message. ST will end the message.
159 $9F APC Application Program Command 237 9/15 AC
ESC _ Starts an application program command string. ST will end the command. The interpretation of the command is subject to the program in question.

ESC X means you press Esc followed by X to produce this control character.
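In the table above, the character following ESC is always 0x40 lower than the C1 code it stands for (ESC @ for 0x80 up to ESC _ for 0x9F). A minimal Python sketch of that correspondence, with hypothetical function names:

```python
ESC = 0x1B

def c1_to_escape(code):
    """Return the two-byte 7-bit form of a C1 control code."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control code")
    return bytes([ESC, code - 0x40])

def escape_to_c1(sequence):
    """Return the C1 code for a two-byte sequence such as ESC [."""
    if len(sequence) != 2 or sequence[0] != ESC:
        raise ValueError("expected ESC followed by one byte")
    if not 0x40 <= sequence[1] <= 0x5F:
        raise ValueError("second byte does not map to a C1 code")
    return sequence[1] + 0x40
```

For example, CSI (0x9B) corresponds to ESC [ and ST (0x9C) to ESC \.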

ISO 8859 special characters

The two special characters, NBSP and SHY, are not really control characters. They are graphic characters with a special feature. The characters also appear in Unicode. They are included here for the sake of completeness.

‡) The 2-character mnemonics for NBSP and SHY are from RFC 1345. They are not standardized.

Dec Hex Char Description Octal Pos ‡)
160 $A0 NBSP No-Break Space 240 10/0 NS
A space for use when a line break is to be prevented.
Note: NBSP can sometimes be produced by pressing Ctrl+Shift+SPACE. No universally supported key combination exists.
In HTML you can write &nbsp; or &#160; to add a no-break space to a web page.
See also: SP
173 $AD SHY Soft Hyphen 255 10/13 --
Indicates an intraword break point for use when a word must be broken across lines. The visual rendering either is a hyphen (ISO 8859) or varies (Unicode).
Note: SHY can sometimes be produced by pressing Ctrl+-. No universally supported key combination exists.
In HTML you can write &shy; or &#173; to add a soft hyphen to a web page.
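Since NBSP and SHY are hard to tell apart from an ordinary space or from nothing at all, it can help to write them as HTML character references when generating markup. A minimal Python sketch; the function name escape_special is hypothetical, and html.unescape from the standard library is used only to check the round trip.

```python
import html

NBSP, SHY = "\u00a0", "\u00ad"

def escape_special(text):
    """Replace NBSP and SHY with their named HTML character references."""
    return text.replace(NBSP, "&nbsp;").replace(SHY, "&shy;")

def round_trips(text):
    """Check that the escaped form decodes back to the original text."""
    return html.unescape(escape_special(text)) == text
```

The numeric references &#160; and &#173; decode to the same two characters.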

Categories

A summary of character categories. Mostly based on ANSI X3.64, ISO 6429, ECMA-6 and ECMA-48.

Delimiters (Control string delimiters)
Delimiters start and end a control string. A control string consists of an opening delimiter, a command string or a character string, and a terminating delimiter (ST).
APC DCS OSC PM SOS ST
Introducers
An introducer is a control character or escape sequence that begins a sequence. The sequence is interpreted as a single graphic character or control.
CSI ESC SCI
Shift function characters (was: Code extension control)
Shift function characters are used to extend the character set of the code. They may alter the meaning of one or more characters that follow them.
SI SO SS2 SS3
Format effectors (also: Layout characters)
Format effectors are mainly intended for the control of the layout and positioning of information. Format effectors (most of them) are data which happen to have a format representation rather than a graphic representation.
BS CR FF HT HTJ HTS IND LF NEL PLD PLU RI VT VTS
Information separators
Information separators separate and qualify data logically. They may be used either in hierarchical order or non-hierarchically. Their specific meanings depend on the application.
FS GS RS US
Presentation control characters
Presentation control characters indicate where a line break may or may not occur.
BPH NBH
Graphic characters
Graphic characters appearing here are those that have control-character-like properties.
NBSP SHY SP
Area definition characters (was: Form filling)
Area definition characters are used for entering information into a preformatted visual display.
EPA ESA SPA SSA
Device control characters
Device control characters are intended for the control of local or remote or ancillary devices. They are not intended to control data communication systems; this should be done with transmission control characters.
DC1 DC2 DC3 DC4
Transmission control characters (was: Communication control)
Transmission control characters are intended to control or facilitate transmission of information over telecommunication networks.
ACK DLE ENQ EOT ETB ETX NAK SOH STX SYN
Miscellaneous
Miscellaneous control characters fall outside other categories.
BEL CAN CCH DEL EM MW NUL PU1 PU2 STS SUB
Not assigned
Unassigned control characters are ones that were not standardized. Their location was reserved for future standardization. These characters are known by names that appeared in a draft (DIS 10646), even though they didn't make it to the final standard.
PAD HOP SGCI

Translations

The translated terms are taken from the given standards. Several alternative translations may exist.

English French Russian Spanish German Finnish
Standard Unicode 5.0 GOST 34.301‑91, GOST 34.302.2‑91 T.53, T.51, T.50 DIN 66003 SFS 4017 *
NUL Null NUL nul ПУС пусто nulo Füll­zeichen tyhjä­merkki
SOH Start of Heading DET début d'en-tête НЗ начало заголовка comienzo de encabe­za­miento Anfang des Kopfes otsikon alku
STX Start of Text DTX début de texte НТ начало текста comienzo de texto Anfang des Textes tekstin alku
ETX End of Text FTX fin de texte КТ конец текста fin de texto Ende des Textes tekstin loppu
EOT End of Transmission FTR fin de transmission КП конец передачи fin de transmisión Ende der Über­tragung tekstin loppu
ENQ Enquiry DEM demande КТМ кто там? pregunta Stations­auf­forderung kysely
ACK Acknowledge ACC accusé de réception [positif] ДА под­твер­жде­ние acuse de recibo Positive Rück­meldung kuit­taus
BEL Bell SON sonnerie ЗВ звонок timbre Klingel ääni­merkki
BS Backspace EFF espace arrière ВШ возврат на шаг retroceso Rück­wärts­schritt peruutus
HT Horizontal Tabulation TAB tabulation horizontale ГТ го­ри­зон­таль­ная табуляция tabulación de caracteres Horizontal-Tabulator sarake­ohjaus
LF Line Feed PAL changement de ligne ПС перевод строки cambio de renglón Zeilen­vorschub rivi­askel
VT Vertical Tabulation TAV tabulation verticale ВТ вер­ти­каль­ная табуляция tabulación vertical Vertikal-Tabulator rivitys
FF Form Feed SDP saut de page, page suivante ПФ перевод формата página siguiente Formular­vorschub sivun vaihto
CR Carriage Return RC retour de chariot ВК возврат каретки retorno del carro Wagen­rücklauf vaunun palautus
SO Shift Out HC hors code ВЫХ выход cambio-salida Dauerum­schaltung koodin­vaihto
SI Shift In EC en code ВХ вход cambio-entrada Rück­schaltung koodin­palautus
DLE Data Link Escape ÉCT échappement transmission AP1 авторегистр один escape de enlace de datos Daten­über­tragungs­um­schaltung ohjaus­koodin poik­keus
DC1 Device Control 1 CD1 commande d'appareil un СУ1 символ устройства один control de dispositivo uno Geräte­steuerung 1 laitteen ohjaus 1
DC2 Device Control 2 CD2 commande d'appareil deux СУ2 символ устройства два control de dispositivo dos Geräte­steuerung 2 laitteen ohjaus 2
DC3 Device Control 3 CD3 commande d'appareil trois СУ3 символ устройства три control de dispositivo tres Geräte­steuerung 3 laitteen ohjaus 3
DC4 Device Control 4 (Stop) CD4 commande d'appareil quatre СУ4 символ устройства четыре control de dispositivo cuatro Geräte­steuerung 4 laitteen ohjaus 4
NAK Negative Acknowledge ACN accusé de réception négatif НЕТ отрицание acuse de recibo negativo Negative Rück­meldung kiel­teinen kuittaus
SYN Synchronous Idle SYN synchronisation СИН синхронизация reposo síncrono Synchroni­sierung tahditus
ETB End of Transmission Block FBT fin de bloc de transmission КБ конец блока fin de bloque de transmisión Ende des Über­tragungs­blocks jakson­siirron loppu
CAN Cancel ANN annulation ОТМ отмена cancelar Ungültig sanoman peruutus
EM End of Medium FS fin de support КН конец носителя fin del medio físico Ende der Auf­zeichnung tieto­välineen loppu
SUB Substitute SUB substitution ЗС замена символа substituto Substi­tution korvike
ESC Escape ÉCH échappement АР2 авторегистр два escape Umschaltung koodin poikkeus
FS File Separator SF séparateur de fichiers РФ разделитель файлов separador de fichero Haupt­gruppen-Trennung tiedoston erotus­merkki
GS Group Separator SG séparateur de groupes РГ разделитель групп separador de grupo Gruppen-Trennung ryhmän erotus­merkki
RS Record Separator SA séparateur d'enre­gis­tre­ments, séparateur d'articles РЗ разделитель записей separador de registro Unter­gruppen-Trennung tietueiden erotus­merkki
US Unit Separator SSA séparateur de sous-articles РЭ разделитель элементов separador de unidad Teil­gruppen-Trennung yksikön erotus­merkki
SP Space ESP espace ПР пробел espacio Zwischen­raum tyhjä
DEL Delete SUP suppression ЗБ забой suprimir Löschen merkin poisto
PAD "Padding Character" caractère de bourre
HOP "High Octet Preset" octet supérieur prédéfini
BPH Break Permitted Here API arrêt permis ici РПС разрешение переноса строки corte permitido aquí
NBH No Break Here PAI aucun arrêt ici ЗПС запрет переноса строки corte no permitido aquí
IND Index IND index ИНД индекс
NEL Next Line NL à la ligne НС новая строка
SSA Start of Selected Area DZS début de zone sélectionnée НВО начало выбранной области
ESA End of Selected Area FZS fin de zone sélectionnée КВО конец выбранной области
HTS Horizontal Tabulation Set TTH taquet de tabulateur horizontal УГТ установка го­ри­зон­таль­ной табуляции
HTJ Horizontal Tabulation with Justification THJ tabulateur horizontal avec justification ГТВ го­ри­зон­таль­ная табуляция с выключкой
VTS Vertical Tabulation Set TTV taquet de tabulateur vertical УВТ установка вертикальной табуляции
PLD Partial Line Down IPav interligne partiel avant ССВ смещение строки вперед avance de línea parcial
PLU Partial Line Up IPar interligne partiel arrière ССН смещение строки назад retroceso de línea parcial
RI Reverse Index IR index renversé, interligne inversé ОПС обратный перевод строки cambio de renglón inverso
SS2 Single Shift Two RU2 rem­pla­ce­ment unique deux ПЕ2 пере­клю­ча­тель единичный два cambio individual dos
SS3 Single Shift Three RU3 rem­pla­ce­ment unique trois ПЕ3 пере­клю­ча­тель единичный три cambio individual tres
DCS Device Control String CCA chaîne de commande d'appareils УЦУ управляющая цепочка устройства
PU1 Private Use One UP1 usage privé un ЧИ1 частное исполь­зо­ва­ние один
PU2 Private Use Two UP2 usage privé deux ЧИ2 частное исполь­зо­ва­ние два
STS Set Transmit State MMT mise en mode transmission УСП установка состояния передачи
CCH Cancel Character ANC annulation du caractère précédent ОТС отмена символа
MW Message Waiting MES ATT message en attente ОС ожидание сообщения
SPA Start of Guarded (Protected) Area DZP début de zone protégée НСО начало сохраняемой области
EPA End of Guarded (Protected) Area FZP fin de zone protégée КСО конец сохраняемой области
SOS Start of String DC début de chaîne НЦ начало цепочки comienzo de cadena
SGCI "Single Graphic Character Introducer" introducteur de caractère graphique unique
SCI Single Character Introducer ICU introducteur de caractère unique ГЕС головной символ единичного символа
CSI Control Sequence Introducer ISC introducteur de séquence de commandes ГУП головной символ управляющей после­до­ва­тель­ности introductor de secuencia de control
ST String Terminator FC fin de chaîne ТРЦ терминатор цепочки terminador de cadena
OSC Operating System Command CSE commande de système d'exploitation КОС команда операционной системы
PM Privacy Message MP message privé ЧС частное сообщение
APC Application Program Command CO PRO commande de progiciel КПП команда прикладной программы
NBSP No-Break Space ESP INS espace insécable непре­ры­ва­ю­щий пробел espacio anticorte yhdis­tävä väli­lyönti *
SHY Soft Hyphen CDN trait d'union conditionnel гибкий дефис guión de corte programable pehmeä tavu­viiva *

* Finnish terms marked with an asterisk do not come from any standard but from the recommendation Eurooppalaisen merkistön merkkien suomenkieliset nimet (Finnish names for the characters of the European character set).
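
The last rows of the table (SP, DEL, the C1 set, NBSP and SHY) are not all control characters in Unicode's own classification. As a minimal sketch, the following Python snippet asks the standard unicodedata module for the general category of a few of them. The names SAMPLES and general_category are illustrative only, and the code points are taken from the code chart at the start of this article.

```python
import unicodedata

# A few acronyms from the table with their code points, as given in the
# code chart at the start of the article. SAMPLES is an illustrative name.
SAMPLES = {
    "SP": 0x20,
    "DEL": 0x7F,
    "PAD": 0x80,
    "NEL": 0x85,
    "CSI": 0x9B,
    "APC": 0x9F,
    "NBSP": 0xA0,
    "SHY": 0xAD,
}


def general_category(code_point):
    """Return the Unicode general category of a code point, e.g. 'Cc'."""
    return unicodedata.category(chr(code_point))


for acronym, code_point in SAMPLES.items():
    print(f"{acronym:5} U+{code_point:04X} {general_category(code_point)}")
```

With a current Unicode database, DEL and the C1 characters report Cc (control), SP and NBSP report Zs (space separator) and SHY reports Cf (format).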

Character index

ACK Acknowledge
APC Application Program Command
BEL Bell
BPH Break Permitted Here
BS Backspace
CAN Cancel
CCH Cancel Character
CR Carriage Return
CSI Control Sequence Introducer
DC1 Device Control 1
DC2 Device Control 2
DC3 Device Control 3
DC4 Device Control 4 (Stop)
DCS Device Control String
DEL Delete
DLE Data Link Escape
EM End of Medium
ENQ Enquiry
EOT End of Transmission
EPA End of Guarded (Protected) Area
ESA End of Selected Area
ESC Escape
ETB End of Transmission Block
ETX End of Text
FE0 Format effector 0 (Backspace)
FE1 Format effector 1 (Character Tabulation)
FE2 Format effector 2 (Line Feed)
FE3 Format effector 3 (Line Tabulation)
FE4 Format effector 4 (Form Feed)
FE5 Format effector 5 (Carriage Return)
FF Form Feed
FS File Separator
GS Group Separator
HOP "High Octet Preset" (unassigned)
HT Horizontal Tabulation
HTJ Horizontal Tabulation with Justification
HTS Horizontal Tabulation Set
IND Index
IS1 Information separator 1 (Unit Separator)
IS2 Information separator 2 (Record Separator)
IS3 Information separator 3 (Group Separator)
IS4 Information separator 4 (File Separator)
LF Line Feed
LS0 Locking-Shift Zero (Shift In)
LS1 Locking-Shift One (Shift Out)
MW Message Waiting
NAK Negative Acknowledge
NBH No Break Here
NBSP No-Break Space
NEL Next Line
NUL Null
OSC Operating System Command
PAD "Padding Character" (unassigned)
PLD Partial Line Down
PLU Partial Line Up
PM Privacy Message
PU1 Private Use One
PU2 Private Use Two
RI Reverse Index
RS Record Separator
SCI Single Character Introducer
SGCI "Single Graphic Character Introducer" (unassigned)
SHY Soft Hyphen
SI Shift In
SO Shift Out
SOH Start of Heading
SOS Start of String
SP Space
SPA Start of Guarded (Protected) Area
SS2 Single Shift Two
SS3 Single Shift Three
SSA Start of Selected Area
ST String Terminator
STS Set Transmit State
STX Start of Text
SUB Substitute
SYN Synchronous Idle
TC1 Transmission control character 1 (Start of Heading)
TC2 Transmission control character 2 (Start of Text)
TC3 Transmission control character 3 (End of Text)
TC4 Transmission control character 4 (End of Transmission)
TC5 Transmission control character 5 (Enquiry)
TC6 Transmission control character 6 (Acknowledge)
TC7 Transmission control character 7 (Data Link Escape)
TC8 Transmission control character 8 (Negative Acknowledge)
TC9 Transmission control character 9 (Synchronous Idle)
TC10 Transmission control character 10 (End of Transmission Block)
US Unit Separator
VT Vertical Tabulation
VTS Vertical Tabulation Set
XOFF Device Control 3
XON Device Control 1
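
The index lists several alternative names (FE0 to FE5, IS1 to IS4, LS0, LS1, TC1 to TC10, XON and XOFF) next to the primary acronyms. As a rough sketch of how such aliases could be resolved in software, the Python snippet below maps each alias to its primary acronym and then to the code point shown in the code chart at the start of this article. ALIASES, CODE_POINTS and resolve are hypothetical names chosen for this example, not part of any standard or library.

```python
from typing import Tuple

# Alias -> primary acronym, exactly as paired in the character index.
ALIASES = {
    "FE0": "BS", "FE1": "HT", "FE2": "LF",
    "FE3": "VT", "FE4": "FF", "FE5": "CR",
    "IS1": "US", "IS2": "RS", "IS3": "GS", "IS4": "FS",
    "LS0": "SI", "LS1": "SO",
    "TC1": "SOH", "TC2": "STX", "TC3": "ETX", "TC4": "EOT",
    "TC5": "ENQ", "TC6": "ACK", "TC7": "DLE", "TC8": "NAK",
    "TC9": "SYN", "TC10": "ETB",
    "XON": "DC1", "XOFF": "DC3",
}

# Primary acronym -> code point, from the code chart at the top of the
# article. Only the characters that have an alias are included here.
CODE_POINTS = {
    "SOH": 0x01, "STX": 0x02, "ETX": 0x03, "EOT": 0x04,
    "ENQ": 0x05, "ACK": 0x06, "BS": 0x08, "HT": 0x09,
    "LF": 0x0A, "VT": 0x0B, "FF": 0x0C, "CR": 0x0D,
    "SO": 0x0E, "SI": 0x0F, "DLE": 0x10, "DC1": 0x11,
    "DC3": 0x13, "NAK": 0x15, "SYN": 0x16, "ETB": 0x17,
    "FS": 0x1C, "GS": 0x1D, "RS": 0x1E, "US": 0x1F,
}


def resolve(name: str) -> Tuple[str, int]:
    """Return (primary acronym, code point) for an alias or acronym."""
    key = name.upper()
    primary = ALIASES.get(key, key)
    return primary, CODE_POINTS[primary]
```

For example, resolve("XOFF") returns ("DC3", 0x13) and resolve("IS4") returns ("FS", 0x1C). A name missing from both tables raises KeyError, which keeps the sketch short. A fuller version would cover every character in the index.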

Sources

Special thanks to Douglas A. Kerr, the principal author and editor of the published standard for the first complete version of ASCII, for his help.

Last updated in August 2016: Unicode 9.0, CP/M, additional details on PAD, HOP and SGCI.

Control characters in ASCII and Unicode
URN:NBN:fi-fe201109235583

© Aivosto Oy