Control characters in ASCII and Unicode

Dozens of odd control characters appear in ASCII charts. The same characters have found their way to Unicode as well. CR, LF, ESC, CAN... what are all these codes for? Should I care about them? This is an in-depth look into control characters in ASCII and its descendants, including Unicode, ANSI and ISO standards.

When ASCII first appeared in the 1960s, control characters were an essential part of the new character set. Since then, many new character sets and standards have been published. Computing is not the same either. What happened to the control characters? Are they still used and if yes, for what?

This article looks back at the history of character sets while keeping an eye on modern use. The information is based on a number of standards released by ANSI, ISO, ECMA and The Unicode Consortium, as well as industry practice. In many cases, the standards define one use for a character, but common practice is different. Some characters are used contrary to the standards. In addition, certain characters were originally defined in an ambiguous or loose way, which has resulted in confusion in their use.

Groups of control characters

Control character areas. First area 7-bit ASCII: C0 followed by SP, graphic characters and DEL. Second area 8-bit: C1 followed by NBSP, graphic characters and SHY

For the purposes of this document, control characters are divided into three groups.

1. ASCII control characters. The ASCII control character area covers code positions 0–31 (hex 00–1F). This area is also called the C0 set. Two additional controls appear at 32 and 127 (hex 20 and 7F). The ASCII control characters cover a wide range of uses, such as text layout, transmission and device control.

2. C1 control characters. C1 covers positions 128–159 (hex 80–9F). C1 is primarily for displays and printers. This set is related to ANSI escape sequences and VT100.

3. ISO 8859 special characters. Two special characters, NBSP and SHY, are from ISO 8859. They are also used in Windows and Unicode. They appear at 160 and 173 (hex A0 and AD).
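
The three groups above can be expressed as a simple lookup on the code position. The following is a minimal sketch; the function name `control_group` and the label "graphic" for everything else are illustrative choices, not taken from any standard.

```python
def control_group(code: int) -> str:
    """Return the group a code position (0-255) belongs to in this article."""
    if 0x00 <= code <= 0x1F:
        return "C0"        # ASCII control characters proper
    if code == 0x20:
        return "SP"        # additional ASCII control at 32
    if code == 0x7F:
        return "DEL"       # additional ASCII control at 127
    if 0x80 <= code <= 0x9F:
        return "C1"        # C1 control characters
    if code == 0xA0:
        return "NBSP"      # ISO 8859 special character at 160
    if code == 0xAD:
        return "SHY"       # ISO 8859 special character at 173
    return "graphic"       # anything else is treated as a graphic character


for code in (0x0A, 0x20, 0x7F, 0x85, 0xA0, 0xAD, 0x41):
    print(f"{code:3d} (hex {code:02X}): {control_group(code)}")
```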

Note: These control character sets are not the only control characters ever used. Other C0 and C1 sets do exist. Alternative sets were defined for special uses. In them, some of the standard C0/C1 controls have been deleted or replaced by new controls. Even totally different alternative sets exist. Alternative control characters are not discussed in this article. One can find them in the International Register of Coded Character Sets.

Control characters in standards

ASCII control characters

C0 = positions 0–31. Origin with ASCII and ISO 646 character sets. Characters SP and DEL appear together with C0.

The first group of control characters originates from ASCII. These characters consist of a set called C0 and two additional characters. The C0 set is in locations 0 to 31. Two additional ASCII characters, SP and DEL, fall outside the C0 area, but they are closely related to the C0 set. All of these characters are defined by the same standards.

This set of control characters covers many uses. There are "Format Effectors" that control the appearance of plain text. There are "Transmission Controls" for use with transmission protocols and "Device Controls" to start, operate and stop auxiliary devices. There are "Information Separators" that delimit various pieces of data. Other controls exist for producing alerts, filling a media, indicating end of media, and for dealing with errors. There are even controls to create new characters and controls. The C0 set was defined with perforated tape, punched cards and typewriter-like devices in mind. Devices have changed since then, but the C0 controls have survived.

History of ASCII control characters

The first version of ASCII was released in 1963. Like the ASCII of today, the 1963 version covered some letters and symbols, as well as control characters. While many of those 35 control characters were similar to those of modern ASCII, some were different. ASCII-1963 had some serious shortcomings, such as no support for lower case letters. It quickly turned out that the standard had to be revised. Today, ASCII-1963 is practically forgotten. Since ASCII-1963 also deviates a lot from later ASCII versions in the control character area, we will not go any deeper into it.

The next revision was ASCII-1965. This version, although formally accepted, was not published. Another revision was going to take place. ASCII as we know it is based on the ASCII-1967 standard (USAS X3.4-1967). This version was an important milestone. It was already very close to the version that then became widely used.

In 1968 ASCII was slightly updated and released as USAS X3.4-1968 (later retronamed as ANSI X3.4-1968). The actual updates were very small, only adding an option to use the character LF as a "newline", and designating ASCII and USASCII as the names of the standard. (Later on, the name USASCII was dropped, leaving ASCII as the official name.)

ASCII-1968 became immensely popular. Almost all of today's computer systems use ASCII or one of its descendants. (A notable exception is EBCDIC used on IBM mainframes, very different from ASCII.) The Internet is based on ASCII-1968 as well.

ASCII-1968 defined the 34 control characters that remained: the C0 set, SP and DEL. Included was a short description of the intended functionality of each control character. These definitions also made their way into RFC 20 word for word. Most of these definitions have remained materially unchanged for decades. Later standards have updated the text, but the basic functionality is still the same. That is the situation as far as the standards go. Non-standard use is common and often contrary to the standards.

When ASCII emerged, computing equipment was quite different from the equipment that ASCII was going to be popularized with. Computers were regularly operated through punched cards, perforated tape and teletypewriters (TTYs). TTYs were typewriter-like devices, which were used as interactive computer terminals. Instead of a monitor they produced output on paper. The ASCII control characters were naturally designed considering the devices of those days. Since then, new devices such as monitors have emerged. It hasn't always been that simple to accommodate the control characters to the newer devices. Despite the challenges, the control characters of the 1960s are still with us.

ISO 646. ASCII evolved to an ISO standard, which is known as ISO 646. The first version came out in 1967. ISO 646 is the "international edition" of ASCII, with a few differences. Despite the differences, these standards were closely related. ISO 646 allowed national variants to support the national characters required for each country. The US national variant was ASCII. Several other national variants were released to support accented letters (à, ü and the like) and other symbols. The ISO variants including ASCII were a common way to express text in the 1970s and 1980s.

As to the control characters, the ASCII control characters set also appeared in ISO 646. The functionality of the control characters remained quite intact, even though the definitions were updated.

More standards. ISO 646 was also released as ECMA-6. The control characters in ECMA-6 are very similar to those of ISO 646.

Some of the C0 codes were further refined in other standards. SI, SO and ESC appeared as character set extension controls in ANSI X3.41, ISO 2022 and ECMA-35. These characters became widely used to invoke additional character sets. The Transmission Control characters (TC₁ to TC₁₀) appeared in ISO 1745 in 1975, which gave a detailed description of where and how they should be used. How widely ISO 1745 was actually used in transmission is another question.

Current status of ASCII control characters

ASCII was later updated in 1977 and again in 1986 to be in conformance with ISO 646. The control characters in ASCII-1986 and ISO 646/ECMA-6 are very similar, even though minor differences do exist.

The current ISO and ECMA versions, namely ISO 646:1991 and ECMA-6:1991, no longer define the C0 control characters. The control characters didn't go away, however. They now appear in ISO/IEC 6429:1992 and ECMA-48:1991, respectively. Simply put, the C0 set was lumped together with other control characters, the C1 set, which follows below.

As to some specific control characters, the current detailed definitions of SI, SO and ESC can be found in ANSI X3.41, ISO 2022 and ECMA-35. The current details for the Transmission Control characters (TC₁ to TC₁₀) appear in the old ISO 1745 from 1975.

Even though the history of the various standards related to the ASCII control codes may sound unnecessarily complicated, the standard functionality of the characters has not changed dramatically. It's still mostly the same as back in 1967. So much for the standards. Practice is quite different. Some control characters are indeed commonly used the standard way. On the other hand, many are used contrary to the standards, or simply ignored. It's not uncommon to find control characters forbidden in data. Control characters can have unwanted or unknown side-effects. The easiest way for programmers to deal with them is to shut their eyes or reject such characters altogether.

C1 control characters

C1 = positions 128–159. Primarily for displays and printers.

The C1 set appeared in the late 1970s. It is primarily designed for controlling display and printer devices, even though some of the controls warrant other uses as well. The C1 set is intended for use with the C0 set.

The C1 set includes "Format effectors" that control horizontal and vertical movement when displaying or printing. There are "Presentation controls" for defining line-break behavior. There are "Area definition" controls for form filling. There are "Introducers" and "Shift Functions" to support extra controls and characters. Additional controls exist for sending command strings and setting an indicator. Some of the controls were intended to cover for shortcomings in the C0 set. Some controls were reserved: 2 controls are for private use, while 4 controls were (and still are) reserved for future standardization.

The C1 set occupies positions 128–159 in 8-bit environments. There are also escape codes to use the C1 set on 7-bit systems. The respective escape codes (ESC char) are given in the C1 list further below.
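
The relationship between the 8-bit and 7-bit forms can be sketched in a few lines. This assumes the usual ISO 2022 / ECMA-48 rule that a C1 control at position 0x80 + n is represented in 7 bits as ESC followed by the character at 0x40 + n; the function names are hypothetical.

```python
ESC = 0x1B


def c1_to_7bit(code: int) -> bytes:
    """Return the two-byte ESC form of an 8-bit C1 control (assumed rule: byte - 0x40)."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 code position")
    return bytes([ESC, code - 0x40])


def seven_bit_to_c1(seq: bytes) -> int:
    """Return the 8-bit C1 position for a two-byte ESC form."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not a 7-bit C1 escape code")
    return seq[1] + 0x40


print(c1_to_7bit(0x9B))                 # CSI as ESC followed by "["
print(hex(seven_bit_to_c1(b"\x1bE")))   # ESC E back to the NEL position
```

Under this assumption, CSI (hex 9B) becomes ESC [ and NEL (hex 85) becomes ESC E.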

History of C1

In 1979 ANSI released additional controls for use with ASCII (ANSI X3.64). This came to be known as the C1 set. A similar set was also released as ECMA-48. According to ANSI, the C1 controls were intended for input/output control of two-dimensional character-imaging devices, including interactive terminals of both the cathode ray tube and printer types, as well as output to microfilm printers.

A bit later, in 1983, the C1 set was standardized as ISO 6429. Standard-wise, the C1 set has been volatile. Both ISO 6429 and ECMA-48 were updated several times. New control characters were added and definitions updated. One of the C1 characters (IND) was eventually deprecated and removed.

The standards actually cover more control codes than those that fit in the C1 area. These additional controls are used via control sequences (escape sequences). The sequences are beyond the subject of this article. Let it suffice that the sequences are an important part of the standards that should be used together with the C1 controls. The sequences, together with C1, are also known as VT100 and ANSI escape sequences.

Current status of C1

The current standards for C1 are ISO/IEC 6429:1992 and ECMA-48:1991. These standards now define both the C0 and C1 control characters.

Unicode allows the use of C1 (and C0 too). In fact, the C1 area has been entirely reserved for control codes in Unicode. In contrast, the (somewhat outdated) DOS and Windows codepages, i.e. character sets, have not reserved space for C1. Instead, they have included additional graphic characters in the C1 area. This doesn't prevent the use of C1 controls on DOS and Windows, though.
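
The difference is easy to observe by decoding the same byte under different character sets. The sketch below uses Python's built-in codecs (`latin-1`, `cp1252`, `cp437`) as stand-ins for ISO 8859-1, a Windows codepage and a DOS codepage; it illustrates the point for a single byte and is not a survey of all codepages.

```python
import unicodedata

byte = bytes([0x80])  # first position of the C1 area

as_latin1 = byte.decode("latin-1")  # ISO 8859-1: maps straight to U+0080
as_cp1252 = byte.decode("cp1252")   # Windows-1252: a graphic character
as_cp437 = byte.decode("cp437")     # DOS codepage 437: a graphic character

for label, ch in (("latin-1", as_latin1), ("cp1252", as_cp1252), ("cp437", as_cp437)):
    # General category "Cc" means control character
    print(label, f"U+{ord(ch):04X}", unicodedata.category(ch))
```

Only the ISO 8859-1 decoding yields a code point with the general category Cc; the two codepages yield printable characters.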

In practice, the C1 control characters are not very common. They are specialized codes for special applications.

ISO 8859 special characters NBSP and SHY

Positions 160 and 173.

ISO 8859 is a group of 8-bit extended character sets. The sets cover various Latin characters and also Cyrillic, Greek, Arabic, Hebrew and Thai characters. ISO 8859 is related to the Windows character sets ("ANSI codepages"), but these are actually different from each other.

Two characters in ISO 8859 are of interest to us: Non-Breaking Space (NBSP) and Soft Hyphen (SHY). They both have control character like properties, even though they are not actually called control characters in ISO 8859.

NBSP appears in position 160 (hex A0) and SHY in position 173 (hex AD). The same positions, and roughly the same meanings too, have been adopted in many of the Windows codepages and in Unicode.

Note: ISO 8859-8 Latin/Hebrew defines two additional special characters, namely LRM (left-to-right mark) and RLM (right-to-left mark). These characters are not universal in ISO 8859, but specific to Hebrew. Since LRM and RLM were not used in any other ISO 8859 character set, and since they do not appear in Unicode at the same positions, they are not further presented in this article.

Current status of NBSP and SHY

Several current standards include NBSP and SHY. They appear at the same positions in all of the following:

ISO 8859-1 to 8859-16.
Exception: ISO 8859-11 Latin/Thai does not include SHY.
Windows codepages 1250–1258.
Unicode, block U+0080 C1 Controls and Latin-1 Supplement.
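
A few of these positions can be spot-checked with Python's bundled codecs. This is a sketch only: the helper name `specials_at_a0_ad` is hypothetical, the codec names are Python's spellings, and only a sample of the listed character sets is checked.

```python
NBSP, SHY = "\u00a0", "\u00ad"


def specials_at_a0_ad(codec: str) -> tuple[str, str]:
    """Decode bytes A0 and AD with the given codec; U+FFFD marks an unmapped byte."""
    text = bytes([0xA0, 0xAD]).decode(codec, errors="replace")
    return text[0], text[1]


for codec in ("iso8859_1", "iso8859_15", "cp1251", "cp1252", "iso8859_11"):
    at_a0, at_ad = specials_at_a0_ad(codec)
    print(f"{codec:11} A0 is NBSP: {at_a0 == NBSP}   AD is SHY: {at_ad == SHY}")
```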

Control characters in Unicode

Control characters have made their way to Unicode as well. Unicode recognizes control characters and explicitly allows their use. While Unicode doesn't obsolete control characters, it defines special rules for just a handful of them. Let the standard speak for itself:

The Unicode Standard provides for the intact interchange of these code points, neither adding to nor subtracting from their semantics. The semantics of the control codes are generally determined by the application with which they are used. However, in the absence of specific application uses, they may be interpreted according to the control function semantics specified in ISO/IEC 6429:1992. (Unicode 9.0 p. 822)

Unicode specifies semantics for the following control characters. The semantics appear to be in line with their original semantics, even though some differences may exist.

ASCII control characters:
- HT and SP are considered whitespace.
- LF, VT, FF and CR are considered whitespace, and also mandatory line breaks in the line breaking algorithm.
- FS, GS, RS and US are considered separators in the bi-directional algorithm.
C1 control characters:
- NEL is considered a mandatory line break in the line breaking algorithm, even though supporting it is optional.
ISO 8859:
- NBSP and SHY. These characters are not actually control characters in Unicode. Instead, NBSP is "Separator, space" and SHY is "Other, format". Both characters have features in the line-breaking algorithm. With SHY, Unicode is significantly more elaborate than ISO 8859 in that Unicode suggests more hyphenation features than just displaying a hyphen.
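
Some of the properties listed above can be inspected with Python's `unicodedata` module. The sketch below prints the general category and the bidirectional class of a few characters; the results reflect the Unicode version bundled with the Python interpreter, and the helper name `properties` is illustrative.

```python
import unicodedata

SAMPLES = {
    "HT": "\x09", "LF": "\x0a", "CR": "\x0d",
    "FS": "\x1c", "GS": "\x1d", "RS": "\x1e", "US": "\x1f",
    "SP": "\x20", "NEL": "\x85", "NBSP": "\xa0", "SHY": "\xad",
}


def properties(ch: str) -> tuple[str, str]:
    """Return (general category, bidirectional class) of a character."""
    return unicodedata.category(ch), unicodedata.bidirectional(ch)


for name, ch in SAMPLES.items():
    category, bidi = properties(ch)
    print(f"{name:5} U+{ord(ch):04X}  category={category}  bidi={bidi}")
```

In the output, the C0 and C1 characters carry the category Cc (control), whereas NBSP is Zs (separator, space) and SHY is Cf (other, format). FS, GS and RS have the bidirectional class B (paragraph separator) and US has the class S (segment separator).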

Note: While no new control characters appear in Unicode, it does define some of its own special characters, such as formatting characters. These characters are beyond the scope of this article.

From ASCII via ISO to Unicode

The following diagram summarizes the development of character standards. You can see how the control characters were propagated from ASCII (X3.4) and other standards to Unicode.

Diagram of standards (ASCII, ISO and Unicode) that define control characters

Control characters in modern applications

With so many control characters coming from the 1960s and 1970s, are they still useful for application programmers?

It depends on the application. Generally speaking, one needs control characters to work with old interfaces or devices. New protocols and file formats tend to use some other mechanism than control characters. Current formats typically use textual markup such as XML, which has little use for control characters beyond whitespace. On the device control side, unless you are writing device drivers, you control devices through operating system calls or library routines rather than sending them control strings to do tricks.

The following is a subjective list of which characters are still in common use and which ones are used less. The list is based on experience writing application software for Windows and DOS.

ASCII control characters: some used, some not
- NUL is still common in everyday use. NUL terminates a string in many programming languages and interfaces.
- Transmission control characters (TC₁ to TC₁₀) are generally of little use. Data transfer is done through TCP/IP sockets, HTTP, FTP or some other protocol. Individual transmission control characters appear for special uses.
- BEL probably no longer appears in its original use. Rather than sending BEL to produce beeps, applications tend to play a sound via other means.
- Format effectors (FE₀ to FE₅) are possibly the most important control characters these days. Some of them, such as CR and LF, are essential for a system to work at all. HT is also very common, especially in plain text files. BS and FF are less common. VT appears only rarely if ever.
- Device control characters (DC₁ to DC₄) are not required to control devices, really. To control a device from an application you rather make a system call. On the other hand, you might still need XOFF (^S) or XON (^Q) in a command line session from time to time.
- SO, SI and ESC used to be common, but this has changed. One may find them from time to time, presumably on older systems.
- CAN and EM are not in common use.
- SUB might no longer appear as a substitute. You will more likely see something like "?" or the Unicode REPLACEMENT CHARACTER (U+FFFD) as a substitute for a bad character. Another use for SUB still exists, though. You could find it at the end of a text file.
- Information separators (IS₄ to IS₁) are technically still valid. Whether anyone uses them to separate information is another question. Other techniques are used instead, such as XML or database systems. As a simple delimiter character a NUL, HT, CR/LF, comma or semicolon is more common than any of the information separators originally designed for the purpose.
- SP must be the heaviest used control character of them all.
- DEL – well, did you ever see one?
- Characters ^A to ^Z (1 to 26) frequently appear as keyboard shortcuts in various applications and operating systems. The actual feature triggered by a keyboard shortcut is often unrelated to the respective control character. More of that follows below.
C1 control characters: little use
- NEL is the only C1 character recognized by Unicode. The most probable case to run into NEL is when EBCDIC compatibility is required.
- The other C1 characters appear outdated now. Since VT100 (that uses C1 extensively) is still a current method with Unix shell sessions, C1 is alive, maybe even everyday business for you. From a programmer's point of view the entire C1 set is rarely used.
ISO 8859 special characters: in use
- NBSP is an everyday character to suppress a line break. It is supported by several current standards, including HTML and Unicode.
- SHY seems to be less frequent.
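
To illustrate the information separators mentioned in the list above, here is a minimal sketch that uses US (hex 1F) between fields and RS (hex 1E) between records. The function names and the flat two-level layout are assumptions made for the example; the standards allow a deeper hierarchy using GS and FS as well.

```python
US = "\x1f"  # unit separator, used here between fields
RS = "\x1e"  # record separator, used here between records


def encode_records(records: list[list[str]]) -> str:
    """Join fields with US and records with RS (fields must not contain US or RS)."""
    return RS.join(US.join(fields) for fields in records)


def decode_records(text: str) -> list[list[str]]:
    """Split a string produced by encode_records back into records and fields."""
    return [record.split(US) for record in text.split(RS)]


rows = [["Smith, John", "line one\nline two"], ["Doe; Jane", "tab\there"]]
packed = encode_records(rows)
print(repr(packed))
print(decode_records(packed) == rows)
```

Because the delimiters are control characters, fields may contain commas, semicolons, tabs and newlines without any quoting. The price is that the data is no longer comfortably editable in a plain text editor, which may be one reason the simpler delimiters are more common.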

Some frequently used characters, especially in a special field, may not have been mentioned. If you know frequent current uses for any of the characters, let us know.

Many of the control characters only appear rarely. How did this affect the space efficiency of 7-bit and 8-bit character sets? Instead of reserving space for control characters, it was possible to reuse these areas for additional graphics. This was actually done by DOS, Windows and Mac, all of which assigned graphic characters to the control character areas. Unicode chose to be different in this respect. Since its code space is much larger than 128 or 256, it was possible to reserve the C0/C1 areas entirely for control characters. This has helped the control characters to survive, if not in practical use, then at least in various code charts and lists.

Keyboards and control characters

Users can create many of the control characters from their keyboards. This usually happens in combination with the Ctrl key, and, more rarely, with the Esc key. There are also some special keys that produce control characters on their own. ←Backspace, Enter, Esc, Space and ↹ Tab are the usual ones.

Key presses and control characters, while having some things in common, are usually unrelated. Pressing a key combination doesn't generally trigger the functionality of the respective control character. As an example, while it's possible to press Ctrl+O to create an SI (Shift In), pressing the keys seldom runs the operation associated with SI (switch back to the standard character set). Instead, Ctrl+O might start an operation beginning with an O, such as "Open".

In some cases a key press does trigger the respective control character feature. Pressing the ↹ Tab key, or Ctrl+I, can indeed produce an HT (Horizontal Tabulation) and move the cursor forward on the line. This is an exception rather than the norm, though.

Some key combinations are more likely than others. Ctrl+A through Ctrl+Z (in other words, ^A to ^Z) are common keyboard shortcuts. Control key combinations with a symbol (^@, ^[, ^\, ^], ^^, ^_, ^?) are less common. There is a reason why such combinations should be avoided. Considerable variation exists with symbol keys in different keyboard layouts. A Ctrl and symbol key combination doesn't always produce the same control character, or any character at all, which makes it less useful as a keyboard shortcut.
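
The caret notation used above follows a simple arithmetic rule on a terminal-style keyboard: the control code is the code of the character in the range @ to _ (hex 40–5F) with the two highest bits cleared, and ^? stands for DEL. The following sketch shows that rule; the function name `ctrl_code` is hypothetical, and actual keyboards and layouts may behave differently, as noted.

```python
def ctrl_code(key: str) -> int:
    """Return the control code for caret notation ^key (^@ to ^_ and ^?)."""
    if len(key) != 1:
        raise ValueError("expected a single character")
    if key == "?":
        return 0x7F                      # ^? is DEL
    code = ord(key.upper())              # ^a and ^A are the same
    if not 0x40 <= code <= 0x5F:
        raise ValueError(f"no control code for ^{key}")
    return code & 0x1F                   # keep the low five bits


for key in ("@", "C", "I", "Z", "[", "_", "?"):
    print(f"^{key} -> {ctrl_code(key):3d} (hex {ctrl_code(key):02X})")
```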

In this article the focus is on the programmatic features of control characters. Less focus is put on the use of keyboard shortcuts.

About the character list

Next we are going to list every control character in detail. The column Dec refers to the decimal value of the control code ("ASCII value"). Hex is the same in hexadecimal, preceded by a dollar sign for clarity. An octal value is also given. The column Pos shows the row/column of the character in code charts.
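
The numeric columns are all derived from the same code value. As a small sketch (the function name `chart_columns` is illustrative), the Dec, Hex, Octal and Pos values for a code can be computed like this, with Pos being the high four bits and the low four bits of the code:

```python
def chart_columns(code: int) -> tuple[str, str, str, str]:
    """Return the Dec, Hex, Octal and Pos columns for a code position 0-255."""
    dec = str(code)
    hexa = f"${code:02X}"                  # hexadecimal with a leading dollar sign
    octal = f"{code:03o}"                  # three octal digits
    pos = f"{code >> 4}/{code & 0x0F}"     # position in the code chart
    return dec, hexa, octal, pos


for code in (0, 8, 10, 27, 127, 160):
    print("\t".join(chart_columns(code)))
```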

The list shows key presses that (often) produce the control character on the keyboard. In addition, C-style escape sequences (\c) are provided where available, as are special constants supported by Visual Basic: classic version and Visual Basic .NET.

The last column lists mnemonics and graphic symbols. The symbols (in black) have been standardized, but they have fallen into disuse. The 2-letter mnemonics are standardized for the ASCII section. Additional 2-letter mnemonics for the C1 and ISO 8859 sections are taken from RFC 1345, which is not a standard, but is frequently referred to in this context.

C0 and ASCII (hex 00–1F, 20, 7F)
00 NUL	01 SOH	02 STX	03 ETX	04 EOT	05 ENQ	06 ACK	07 BEL
08 BS	09 HT	0A LF	0B VT	0C FF	0D CR	0E SO	0F SI
10 DLE	11 DC1	12 DC2	13 DC3	14 DC4	15 NAK	16 SYN	17 ETB
18 CAN	19 EM	1A SUB	1B ESC	1C FS	1D GS	1E RS	1F US
20 SP	. . . .	7F DEL

C1 (hex 80–9F)
80 PAD	81 HOP	82 BPH	83 NBH	84 IND	85 NEL	86 SSA	87 ESA
88 HTS	89 HTJ	8A VTS	8B PLD	8C PLU	8D RI	8E SS2	8F SS3
90 DCS	91 PU1	92 PU2	93 STS	94 CCH	95 MW	96 SPA	97 EPA
98 SOS	99 SGCI	9A SCI	9B CSI	9C ST	9D OSC	9E PM	9F APC

8859
A0 NBSP	AD SHY

Chart legend: Delimiter, Introducer, Shift, Format effector, Info separator, Presentation control, Graphic, Area definition, Device control, Transmission control, Misc, Not assigned

Character list

ASCII control characters (C0)

The ASCII control characters work in 7-bit and 8-bit environments, as well as in Unicode. These controls originate from a set of related standards: ASCII, ISO 646 and ECMA-6, and also ISO 6429 and ECMA-48. All of these characters are available in Unicode, too. The actual C0 set consists of characters NUL through US (0–31). Two additional characters, SP and DEL, are a part of ASCII and the related standards as well.

*) The 2-character mnemonics for the ASCII set are from ANSI X3.32, ISO 2047 and ECMA-17. So are also the graphic symbols. The symbols are outdated and rarely used. A couple of the symbols also have alternative forms.

Dec	Hex	Char	Description	Octal	Pos	*)
0	$00	NUL	Null	000	0/0	NU
	`\0`	`^@`	NUL is defined in the standards as a filler character. It can be used as media-fill or time-fill. NUL doesn't affect the information content of a data stream. It may affect the information layout and the control of equipment, though.
			Note: NUL was originally intended as an ignorable filler character with no meaning. Especially convenient on paper tape, where a NUL equals no holes punched, it could be used to reserve space for new information or correcting errors. ASCII-1986 even suggests NUL as a "time-waster" character to be added after a newline to accommodate mechanical devices where a carriage return works slowly. Despite this, NUL has been used contrary to the standards in null-terminated strings as an End-Of-String marker. Several programming languages use this convention.
			Constant in Visual Basic and VB.NET: `vbNullChar, NullChar`
1	$01	SOH	Start of Heading — TC₁ Transmission control character 1	001	0/1	SH
		`^A`	Indicates the beginning of a heading in a transmission. The heading can be terminated by STX. As per ASCII-1968, a heading constitutes a machine-sensible address or routing information. Later standards have dropped the explanation.
			Note: SOH, along with STX and ETX, was intended for data transmission. It is not intended for marking a heading in a document.
2	$02	STX	Start of Text — TC₂ Transmission control character 2	002	0/2	SX
		`^B`	STX has two functions in a transmission: it 1) indicates the beginning of a text and 2) may terminate a heading (see SOH). As per ASCII-1968, text is what should be transmitted to a destination. Later standards have dropped the explanation.
3	$03	ETX	End of Text — TC₃ Transmission control character 3	003	0/3	EX
		`^C`	Terminates a text in a transmission. As per ASCII-1968, a text starts with STX and ends with ETX. Later standards don't necessarily require the pairing of STX with ETX.
			Note: ETX may be used to call for reply from a slave station after a message has been sent. ETX is also commonly used to terminate an interactive process (keyboard: `Ctrl+C`).
			`Ctrl+Break` on PC keyboard produces this character code.
4	$04	EOT	End of Transmission — TC₄ Transmission control character 4	004	0/4	ET
		`^D`	Indicates the conclusion of a transmission. The transmission may have contained one or more texts and associated heading(s).
			Note: EOT can be used to end or abort a transmission. It can also be a reply to indicate inability to receive further messages. EOT (keyboard: `Ctrl+D`) is even used as an End-Of-File control in a Unix shell session.
5	$05	ENQ	Enquiry — TC₅ Transmission control character 5	005	0/5	EQ
		`^E`	Requests a response from a remote station. The response may include station identification or status. ENQ can be used as a "Who Are You" (WRU) to identify a remote station, especially after a new connection has been established.
6	$06	ACK	Acknowledge — TC₆ Transmission control character 6	006	0/6	AK
		`^F`	An affirmative response. Transmitted from a receiver as a response to the sender.
			Note: ACK can indicate that a slave station has received a message correctly and is ready to receive more.
7	$07	BEL	Bell	007	0/7	BL
	`\a`	`^G`	Calls for human attention. BEL may control alarm or attention devices.
			Note: BEL is the only control character with an audible effect. It has been used to ring a bell (indeed) or produce a beep sound. A visual alarm is also possible.
			In Unicode, this control character is abbreviated BEL but named ALERT, while the name BELL is confusingly used for a graphic character (🔔).
8	$08	BS	Backspace — FE₀ Format effector 0	010	0/8	BS
	`\b`	`^H`	Moves one character position backwards (keeping the previous character).
			Note: Contrary to the standards, BS has been used as a combined "move back and delete" operation to remove the previous character. This is not the standard meaning of BS, however. BS is defined as a non-destructive "move back" or "move left" operation, similar to a backspace in mechanical typewriters. To delete the previous character, BS should be followed by DEL. On paper tape the result would be the previous character being completely punched out (erased). BS followed by another character would strike two characters in the same position. Overstriking was a way to produce combined characters. This option was intended to internationalize ASCII. A letter followed by BS followed by a diacritic symbol would produce an accented letter. As an example, u BS ^ would produce û. Several ASCII characters (" ' ` ^ ~ ,) were indeed defined to be used as diacritic symbols. Overstriking could also be suitable with other characters, such as for underlining with the "_" character or printing a slash "/" over "=" to produce "not equal". It could even be used to achieve a strike-through effect (perhaps with -, / or X) to indicate removed text. A boldface effect could be achieved by striking the same character several times at the same position.
			Overstriking was a useful option with printing devices, but displays hardly support it. With the advent of more capable character sets and formatting techniques overstriking can be considered outdated. ASCII-1986 does not require overstriking capabilities and suggests that overstriking may be proscribed in the future. ISO 8859 explicitly forbids overstriking.
			`←Backspace` on PC keyboard produces this character code.
			Constant in Visual Basic and VB.NET: `vbBack, Back`
			See also: CCH
9	$09	HT	Horizontal Tabulation — FE₁ Format effector 1 (Character Tabulation)	011	0/9	HT
	`\t`	`^I`	Advances to the next pre-determined character position (horizontal tab stop). HT could also be used as a skip function on punched cards.
			Note: HT is commonly also abbreviated TAB.
			Even though the standards don't set a universal tab width, a typical fixed tab width is 8 columns. Other tab widths, as well as custom tab positions, are used as well. HT is a simple method of data compression: a single character can represent several spaces in formatted text.
			The `↹ Tab` key on the keyboard is consistent with HT in that it usually produces the code HT. How the HT is treated in each application is another story. In windowing environments, there are three common alternative uses. Pressing `↹ Tab` can either add an HT character into text, indent text (possibly by adding an appropriate number of spaces or shifting the margin), or do something completely different: jump to the next field or control in a graphical user interface. This way the key has been extended to cover more uses than what HT was originally intended for.
			The original name of HT is Horizontal Tabulation. It was later renamed as HT Character Tabulation, first in ECMA-48:1986.
			`↹ Tab` on PC keyboard produces this character code.
			Constant in Visual Basic and VB.NET: `vbTab, Tab`
10	$0A	LF	Line Feed — FE₂ Format effector 2	012	0/10	LF
		`^J`	LF has two alternative functions. It advances to the same character position on the next line (move down), or optionally to the first position on the next line (move to start of next line, i.e. newline). Originally LF was a move-down. A newline option (NL) was added soon after. The option allowed LF to be used as a newline, which works like a combined CR LF. Use of LF as a newline requires agreement between sender and recipient of data. Universal agreement has not been reached.
			Note: LF, having two alternative functions, has been a major source of confusion. While LF was initially defined as a "move down" operator, standards began to allow LF as a newline too. As a result, operating systems differ in their definition of a newline. A newline is LF on Unix. Operating systems using CR LF include CP/M, DOS, OS/2 and Windows. Naturally, this caused an incompatibility. To solve the problem, control characters IND and NEL were added to the C1 area. This did not solve the issue, resulting in IND being removed later. ECMA-6:1985 and ASCII-1986 attempted to clarify the situation by declaring LF deprecated for a newline and recommending CR LF instead. ECMA-48:1991 no longer allows LF to function as a newline.
			The escape sequence for newline and LF is another source of confusion. \n is the common sequence for a newline, whereas there is no such sequence for a line feed. The actual control character(s) represented by \n depend on the system. In some cases, \n indeed represents LF, but it can also represent another newline sequence.
			`Ctrl+Enter` on PC keyboard produces this character code.
			Constant in Visual Basic and VB.NET: `vbLf, Lf`
			See also: CR IND NEL
11	$0B	VT	Vertical Tabulation — FE₃ Format effector 3 (Line Tabulation)	013	0/11	VT
	`\v`	`^K`	Advances to the same character position on the next pre-determined line. ASCII-1977 and ASCII-1986 optionally allow VT to advance to the first position on the next pre-determined line, if agreed on.
			Note: The original name of VT is Vertical Tabulation. It was later renamed as VT Line Tabulation, first in ECMA-48:1986. VT has been used to jump down to the next pre-defined line when printing on a paper form. According to some sources, vertical tab stops were typically spaced 6 lines apart. VT is a simple data compression method where a single VT represents several LF characters (and optionally a CR too).
			In modern use VT must be quite a rare character. As Bob Bemer, one of the original designers of ASCII, put it: "This is a very dangerous character to use. It cannot be used directly on any terminal that I know of. Even if it could, the implementation rules are not supplied unambiguously in the ASCII standard."
			Constant in Visual Basic and VB.NET: `vbVerticalTab, VerticalTab`
12	$0C	FF	Form Feed — FE₄ Format effector 4	014	0/12	FF
	`\f`	`^L`	Advances to the next form or page. Standards differ in what column the subsequent character position will be in. Originally, ASCII-1968 did not define the column at all. ISO and ECMA standards declare that FF does not change the column. ASCII-1977 and ASCII-1986 optionally allow, by agreement, moving to the first column, as if FF was actually CR FF.
			Note: FF has been used as "page break" in text files, "new page" on printers and "clear the screen" on displays. The situation was originally unclear whether FF was just a "new page" operator or "new page, move to column 1". ASCII-1977 and ECMA-6:1985 attempted to clarify the situation by recommending the use of CR FF. ASCII-1986 even implied that the "new page, move to column 1" option might be deleted in a future edition of ASCII.
			Constant in Visual Basic and VB.NET: `vbFormFeed, FormFeed`
13	$0D	CR	Carriage Return — FE₅ Format effector 5	015	0/13	CR
	`\r`	`^M`	Traditional definition: Moves to the first position on the same line (ASCII, ISO 646, ECMA-6). Newer definition: Moves to the line home position or line limit position of the same line (ISO 6429, ECMA-48).
			Note: The standard meaning of CR is "move to beginning of current line". This allows overprinting the line with new characters, which could be used to achieve underlining, for example. For advancing to the next line CR would be followed by LF. On CP/M, DOS, OS/2 and Windows the newline marker is CR LF, which matches the definition. CR alone has been used as the newline character on some systems, such as Commodore and Apple, a use that does not conform to the standards in question. The order CR LF (instead of LF CR) may have been important on mechanical devices where a carriage return took relatively long to execute. A non-printing LF was a more suitable output while the printing head was returning than a graphic symbol, which would have been struck in the middle of the line.
			`Enter` on PC keyboard produces this character code.
			Constant in Visual Basic and VB.NET: `vbCr, Cr`
			See also: LF
14	$0E	SO	Shift Out — LS₁ Locking-Shift One	016	0/14	SO
		`^N`	Used to extend the character set. SO may alter the meaning of the following bit combinations until an SI is reached. Between SO and SI, character positions 33-126 (decimal) may represent additional characters that would not otherwise fit in the regular character set.
			Note: SO (Shift Out) is the normal name of this control. LS1 (Locking-Shift One) is used by ECMA-35 and ECMA-48. In those standards, SO is used in 7-bit environments and LS1 in 8-bit environments. The mechanism to select the alternative character set(s) was defined in ANSI X3.41, ISO 2022 and ECMA-35. It includes the use of escape sequences starting with ESC. SO has also been used on printers to select enlarged characters or another color.
15	$0F	SI	Shift In — LS₀ Locking-Shift Zero	017	0/15	SI
		`^O`	Used in conjunction with SO. It may reinstate the standard meanings of the characters following it.
			Note: SI (Shift In) is the normal name of this control. LS0 (Locking-Shift Zero) is used by ECMA-35 and ECMA-48. In those standards, SI is used in 7-bit environments and LS0 in 8-bit environments. SI has also been used on printers to select condensed characters or to reset color.
16	$10	DLE	Data Link Escape — TC₇ Transmission control character 7	020	1/0	DL
		`^P`	Used to provide supplementary data transmission control functions. DLE changes the meaning of a limited number of following characters.
			Note: DLE is the "escape" character for transmission control. DLE can potentially be put in front of a transmission control character (TC1-TC10) to pass it through "as is" instead of controlling the current transmission. This is not always the case, though. It is possible to create new transmission control sequences with DLE, much as ESC is used to create escape sequences for other purposes. Contrary to the standards, ^P has been used as a keyboard shortcut to echo console activity at the printer.
17	$11	DC₁	Device Control 1 — XON	021	1/1	D1
		`^Q`	Intended to turn on or start an ancillary device, to restore it to the basic operation mode (see DC2 and DC3), or for any other device control function.
			Note: DC1 is conventionally called XON when used in communication for software flow control. The meaning of XON is to continue data transmission after an XOFF (DC3) has been received. The name XON ("transmit on") does not come from a standard, but it is commonly used.
18	$12	DC₂	Device Control 2	022	1/2	D2
		`^R`	Intended for turning on or starting an ancillary device, setting it to a special mode (restored via DC1), or for any other device control function.
19	$13	DC₃	Device Control 3 — XOFF	023	1/3	D3
		`^S`	Intended for turning off or stopping an ancillary device. It may be a secondary level stop such as wait, pause, stand-by or halt (restored via DC1). It can also perform any other device control function.
			Note: DC3 is conventionally called XOFF when used in communication for software flow control. An XOFF is issued to stop transmission when a device cannot accept more data. Transmission can be continued via XON (DC1). The name XOFF ("transmit off") does not come from a standard, but it is commonly used. The use of XOFF and XON is in line with the standards, even though not directly defined in them.
			XOFF (^S) is sometimes used as a pause command. Continuing requires pressing XON (^Q). ^S also works as a pause on MS-DOS and in Windows command prompt. Pressing any key continues.
20	$14	DC₄	Device Control 4 (Stop)	024	1/4	D4
		`^T`	Intended to turn off, stop or interrupt an ancillary device, or for any other device control function.
21	$15	NAK	Negative Acknowledge — TC₈ Transmission control character 8	025	1/5	NK
		`^U`	Negative response. Transmitted from a receiver as a response to the sender.
			Note: NAK can be sent as a response to indicate inability to receive a message, or to request resending.
22	$16	SYN	Synchronous Idle — TC₉ Transmission control character 9	026	1/6	SY
		`^V`	Used as "time-fill" in synchronous transmission. Sent during an idle condition to retain a signal when there are no other characters to send.
			Note: SYN has been used by synchronous modems, which have to send data constantly. Beginning each transmission with at least two SYN characters is a way to achieve synchronization. The receiving station will possibly ignore SYN, since it doesn't belong to the actual data content.
23	$17	ETB	End of Transmission Block — TC₁₀ Transmission control character 10	027	1/7	EB
		`^W`	Indicates the end of a block of data. Used when data is divided into blocks for transmission.
			Note: ETB, when used to end a block, may call for a reply from a slave station.
24	$18	CAN	Cancel	030	1/8	CN
		`^X`	Indicates that data is in error or should be disregarded. Affects "the data with which it is sent" (ASCII-1968, ASCII-1977) or "the data preceding it" (ASCII-1986, ISO 646, ECMA-6, ECMA-48).
			Note: There are two alternative definitions for the data to be disregarded. The actual scope of cancellation is undefined by the standards and should be defined case by case. ^X has been used as a keyboard shortcut to cancel (delete) the characters on the current line, a use that conforms to the standards.
25	$19	EM	End of Medium	031	1/9	EM
		`^Y`	Identifies 1) the physical end of a medium, 2) the end of the used portion of a medium, or 3) the end of wanted data on a medium.
			Note: EM may have been suitable for paper tape or magnetic tape to say "no more data". Disk file systems use more sophisticated ways to keep track of the used and unused areas of the medium.
			This character is commonly abbreviated EM, except for Unicode, which provides it as an alias with abbreviation EOM.
26	$1A	SUB	Substitute	032	1/10	SB
		`^Z`	Used in place of an invalid or erroneous character. Introduced by automatic means in cases like a transmission error.
			Note: When SUB is used as a substitution character, the reverse question mark symbol seems quite good as its visual representation. Compare SUB to Unicode U+FFFD REPLACEMENT CHARACTER.
			SUB has often been used contrary to the standards. On CP/M and MS-DOS, it appears as an End-Of-File marker for text files (^Z). On Unix, ^Z is a keyboard signal to suspend (stop) a foreground process.
27	$1B	ESC	Escape	033	1/11	EC
	`\e`	`^[`	The first character of an escape sequence. Provides either supplementary characters or additional control functions. ESC changes the meaning of a limited number of following characters.
			Note: ESC is used to form escape sequences, which perform various control functions or apply additional character sets. ESC can also be used to invoke the C1 control characters on a 7-bit system that only supports character positions 0–127.
			On the keyboard, sometimes the `Esc` key indeed produces the ESC control character. In windowing environments, the key typically cancels a dialog or an operation, rather than producing a control character or starting an escape sequence. This kind of an "escape" is not based on the character standards, however. The closest ASCII equivalent for canceling a dialog would be CAN, but since there is no `Can` key on the common keyboards, it can't be used.
			`Esc` on PC keyboard produces this character code.
28	$1C	FS	File Separator — IS₄ Information separator 4	034	1/12	FS
		`^\`	The four information separators (FS, GS, RS and US) are used to separate and qualify data. Each separator has two alternative names: Information Separator Four equals File Separator, Information Separator Three equals Group Separator, Information Separator Two equals Record Separator and Information Separator One equals Unit Separator. The separators can be used either hierarchically or in a non-hierarchical manner. When used hierarchically, the order is US (least inclusive), RS, GS and FS (most inclusive). The content and length of a file, group, record or unit are not specified by the standards.
			FS, when used in a hierarchical order, delimits a data item called a file. It can also delimit anything else.
29	$1D	GS	Group Separator — IS₃ Information separator 3	035	1/13	GS
		`^]`	GS, when used in a hierarchical order, delimits a data item called a group. It can also delimit anything else.
30	$1E	RS	Record Separator — IS₂ Information separator 2	036	1/14	RS
		`^^`	RS, when used in a hierarchical order, delimits a data item called a record. It can also delimit anything else.
31	$1F	US	Unit Separator — IS₁ Information separator 1	037	1/15	US
		`^_`	US, when used in a hierarchical order, delimits a data item called a unit. It can also delimit anything else.
			Note: The information separators were deliberately arranged next to SPACE, which can also be used as an information separator (word separator).
32	$20	SP	Space	040	2/0	SP
			Moves one character position forwards. Space may also have a function equivalent to that of an information separator.
			Note: Space has a dual nature. It can be classified as both a control character and a (non-printing) graphic character. SP is similar to a Format Effector. It can also be used as a fifth Information Separator. Space is sometimes represented by the symbol ƀ or ␢ (b with a stroke) or ␣ (open box). SP does not belong to the C0 set.
			`Spacebar` on PC keyboard produces this character code.
			See also: NBSP
127	$7F	DEL	Delete	177	7/15	DT
		`^?`	Outdated. An ignorable character originally intended for erasing an erroneous or unwanted character in punched tape. In this standard use, DEL wouldn't affect the information content of data, even though it may have affected the information layout and the control of equipment. Standards also allowed DEL to be used as media-fill or time-fill (even though a NUL may be more appropriate).
			Note: DEL is now outdated. It was removed from the latest standards (ECMA-48 in 1991 and ISO 6429 in 1992). The origin of DEL is with perforated paper. On that, DEL was equal to "all holes punched", which is a way to invalidate an erroneous character (rubout). In a sense, DEL is similar to NUL, since both characters mean "nothing". ASCII-1977 suggests the use of DEL as a "time waster" to accommodate mechanical devices where a carriage return takes time to execute. ASCII-1986 recommends NUL as a time waster instead of DEL. DEL does not belong to the C0 set, but is an individual control code.
			`Ctrl+←Backspace` on PC keyboard produces this character code.
			See also: NUL

\x is what you write in a C program to produce the given control character (`\e` is a common compiler extension rather than standard C). ^X means you press Ctrl+X to produce the given control character.
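
The ^X column follows a simple rule: the control code equals the code of the key character with bit value hex 40 toggled. The following is a minimal sketch of that rule in Python. The function names `caret_to_code` and `code_to_caret` are hypothetical and chosen for illustration only.

```python
def caret_to_code(notation: str) -> int:
    """Return the code of a control character written in caret notation, e.g. '^J'."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError(f"not caret notation: {notation!r}")
    key = ord(notation[1].upper())
    # Valid key characters run from '?' (hex 3F) to '_' (hex 5F).
    if not 0x3F <= key <= 0x5F:
        raise ValueError(f"no control character for {notation!r}")
    # Toggling bit hex 40 maps '@'..'_' to 00..1F and '?' to 7F (DEL).
    return key ^ 0x40


def code_to_caret(code: int) -> str:
    """Return the caret notation of a C0 control code (0-31) or DEL (127)."""
    if not (0 <= code <= 0x1F or code == 0x7F):
        raise ValueError(f"not a C0 control or DEL: {code}")
    return "^" + chr(code ^ 0x40)


if __name__ == "__main__":
    for notation in ("^J", "^M", "^[", "^?"):
        print(notation, f"${caret_to_code(notation):02X}")
```

Under this rule `^J` gives hex 0A (LF), `^[` gives hex 1B (ESC) and `^?` gives hex 7F (DEL), which matches the table above. Whether a given keyboard or terminal actually sends these codes depends on the system.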

C1 control characters

The C1 control characters work in 8-bit environments. These controls come from three related standards: ANSI X3.64, ISO 6429 and ECMA-48. All of these characters are also available in Unicode. Three of the positions are unassigned: PAD, HOP and SGCI. A use was planned for them in the failed draft DIS 10646, but they were never actually standardized or put to use. Despite this, one can find these control characters in various C1 lists online, and also as aliases in later Unicode standards.

†) The 2-character mnemonics for C1 are from RFC 1345. They are not standardized.

Dec	Hex	Char	Description	Octal	Pos	†)
128	$80	PAD	unassigned, "Padding Character"	200	8/0	PA
		`ESC @`	A reserved control code. Intended for use as PAD Padding Character in draft DIS 10646, rejected, never standardized (not accepted to ISO 10646).
			Note: Not part of ISO/IEC 6429 or ECMA-48.
			Unicode lists this character as XXX and provides PAD as an alias.
129	$81	HOP	unassigned, "High Octet Preset"	201	8/1	HO
		`ESC A`	A reserved control code. Intended for use as HOP High Octet Preset in draft DIS 10646, rejected, never standardized (not accepted to ISO 10646).
			Note: Not part of ISO/IEC 6429 or ECMA-48.
			Unicode lists this character as XXX and provides HOP as an alias.
130	$82	BPH	Break Permitted Here	202	8/2	BH
		`ESC B`	A point where a line break may occur.
			Note: Roughly equivalent to a soft hyphen except that the means for indicating a line break is not necessarily a hyphen. Compare to Unicode U+200B ZERO WIDTH SPACE.
131	$83	NBH	No Break Here	203	8/3	NH
		`ESC C`	A point where a line break may not occur.
			Note: Compare to Unicode U+2060 WORD JOINER.
132	$84	IND	Index	204	8/4	IN
		`ESC D`	Moves to the next line keeping the current horizontal position.
			Note: According to ECMA-48:1986, IND was provided for use in those cases where LF was implemented as New Line. IND was deprecated in 1988 and withdrawn in 1992 from ISO/IEC 6429 (1986 and 1991 respectively for ECMA-48).
			See also: LF RI
133	$85	NEL	Next Line	205	8/5	NL
		`ESC E`	Moves to the first position of the next line. Alternatively, to line home or line limit position.
			Note: NEL maps to the control character NL (New Line) in the EBCDIC character set used on IBM mainframes.
			See also: LF
134	$86	SSA	Start of Selected Area	206	8/6	SA
		`ESC F`	Starts a string of character positions whose contents can be transmitted. The string ends at ESA (or end of display).
135	$87	ESA	End of Selected Area	207	8/7	ES
		`ESC G`	Ends a string of character positions (started by SSA) whose contents can be transmitted.
136	$88	HTS	Horizontal Tabulation Set, Character Tabulation Set	210	8/8	HS
		`ESC H`	Sets a tab stop at the active position.
			Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed HTS as Character Tabulation Set.
137	$89	HTJ	Horizontal Tabulation with Justification, Character Tabulation with Justification	211	8/9	HJ
		`ESC I`	Moves text to the following tab stop. The text is what comes after the previous tab stop up to the active position.
			Note: This character has several names. ANSI X3.64 originally called it Horizontal Tabulation with Justify. ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed HTJ as Character Tabulation with Justification.
138	$8A	VTS	Vertical Tabulation Set, Line Tabulation Set	212	8/10	VS
		`ESC J`	Sets a vertical tab stop at the active line.
			Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed VTS as Line Tabulation Set.
139	$8B	PLD	Partial Line Down, Partial Line Forward	213	8/11	PD
		`ESC K`	Moves down so that following characters will appear as subscripts. Subscripts end at the next PLU.
			Note: ISO 6429:1992 and ECMA-48:1991 have renamed PLD as Partial Line Forward. Sample: text PLD subscript PLU text.
140	$8C	PLU	Partial Line Up, Partial Line Backward	214	8/12	PU
		`ESC L`	Moves up so that following characters will appear as superscripts. Superscripts end at the next PLD.
			Note: ISO 6429:1992 and ECMA-48:1991 have renamed PLU as Partial Line Backward. Sample: text PLU superscript PLD text.
141	$8D	RI	Reverse Index, Reverse Line Feed	215	8/13	RI
		`ESC M`	Moves to the previous line keeping the current horizontal position.
			Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 renamed RI as Reverse Line Feed, apparently related to the removal of IND.
			See also: IND
142	$8E	SS2	Single Shift Two	216	8/14	S2
		`ESC N`	Used to extend the character set. The next character will be from the currently chosen G2 set.
			Note: For more information see ISO 2022 or ECMA-35. The next character should be in the decimal range 33-126 or 32-127.
143	$8F	SS3	Single Shift Three	217	8/15	S3
		`ESC O`	Used to extend the character set. The next character will be from the currently chosen G3 set.
			Note: For more information see ISO 2022 or ECMA-35. The next character should be in the decimal range 33-126 or 32-127.
144	$90	DCS	Device Control String	220	9/0	DC
		`ESC P`	Starts a device control string. ST ends the string. The control string may include commands to the receiving device, or a status report from the sending device.
145	$91	PU1	Private Use One	221	9/1	P1
		`ESC Q`	Reserved for private use, no standardized meaning.
146	$92	PU2	Private Use Two	222	9/2	P2
		`ESC R`	Reserved for private use, no standardized meaning.
147	$93	STS	Set Transmit State	223	9/3	TS
		`ESC S`	Notifies that data is ready for transfer from a device (ANSI X3.64), or establishes the transmit state in the receiving device (ISO 6429, ECMA-48). Doesn't initiate the actual transmission.
148	$94	CCH	Cancel Character	224	9/4	CC
		`ESC T`	Ignore the preceding graphic character (and CCH itself too). If the previous character is a control character or sequence, ANSI X3.64 says it should be ignored, while ISO 6429 and ECMA-48 leave the action undefined.
			Note: Destructive backspace. Intended to eliminate ambiguity about the meaning of BS.
			See also: BS
149	$95	MW	Message Waiting	225	9/5	MW
		`ESC U`	Sets a message waiting indicator in the receiving device.
150	$96	SPA	Start of Guarded Protected Area, Start of Protected Area, Start of Guarded Area	226	9/6	SG
		`ESC V`	Starts a string of character positions that can't be altered manually or transmitted. Optionally protects against erasure too. EPA will end the string.
			Note: SPA is known as Start of Protected Area (ANSI X3.64, ECMA-48:1979), Start of Guarded Protected Area (ISO 6429:1983, ECMA-48:1984) and Start of Guarded Area (ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991).
151	$97	EPA	End of Guarded Protected Area, End of Protected Area, End of Guarded Area	227	9/7	EG
		`ESC W`	Ends the area started by SPA.
			Note: EPA is known as End of Protected Area (ANSI X3.64, ECMA-48:1979), End of Guarded Protected Area (ISO 6429:1983, ECMA-48:1984) and End of Guarded Area (ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991).
152	$98	SOS	Start of String	230	9/8	SS
		`ESC X`	Starts a control string. The string ends at ST. It cannot contain a SOS. The interpretation of the string depends on the application.
153	$99	SGCI	unassigned, "Single Graphic Character Introducer"	231	9/9	GC
		`ESC Y`	A reserved control code. Intended for use as SGCI Single Graphic Character Introducer in draft DIS 10646, rejected, never standardized (not accepted to ISO 10646).
			Note: Not part of ISO/IEC 6429 or ECMA-48.
			Unicode lists this character as XXX and provides SGC as an alias.
154	$9A	SCI	Single Character Introducer	232	9/10	SC
		`ESC Z`	A reserved control code. The name was standardized as SCI Single Character Introducer, but the actual functionality was not implemented in the standards.
			Note: SCI was to be followed by a single byte, which would represent a control function or a graphic character. The functions or characters were not defined in the standards.
155	$9B	CSI	Control Sequence Introducer	233	9/11	CI
		`ESC [`	Starts a control sequence.
156	$9C	ST	String Terminator	234	9/12	ST
		`ESC \`	Closes a string opened by APC, DCS, OSC, PM or SOS.
157	$9D	OSC	Operating System Command	235	9/13	OC
		`ESC ]`	Starts an operating system control string. The string ends at ST and is interpreted subject to the operating system.
158	$9E	PM	Privacy Message	236	9/14	PM
		`ESC ^`	Starts a privacy message. ST will end the message.
159	$9F	APC	Application Program Command	237	9/15	AC
		`ESC _`	Starts an application program command string. ST will end the command. The interpretation of the command is subject to the program in question.

ESC X means you press Esc followed by X to produce this control character.
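
The ESC column in the table above shows a fixed relationship: each C1 code at hex 80–9F corresponds to ESC followed by the character whose code is hex 40 lower. The sketch below converts between the two forms. The function names `c1_to_escape` and `escape_to_c1` are hypothetical, and the code only illustrates the arithmetic, not any particular terminal's behavior.

```python
ESC = 0x1B


def c1_to_escape(code: int) -> bytes:
    """Return the 7-bit two-byte form (ESC + final byte) of an 8-bit C1 code."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError(f"not a C1 code: {code:#x}")
    return bytes([ESC, code - 0x40])


def escape_to_c1(sequence: bytes) -> int:
    """Return the 8-bit C1 code for a two-byte sequence ESC + '@'..'_'."""
    if len(sequence) != 2 or sequence[0] != ESC or not 0x40 <= sequence[1] <= 0x5F:
        raise ValueError(f"not a 7-bit C1 sequence: {sequence!r}")
    return sequence[1] + 0x40


if __name__ == "__main__":
    print(c1_to_escape(0x9B))            # CSI as ESC [
    print(hex(escape_to_c1(b"\x1bE")))   # ESC E as NEL
```

For example, CSI (hex 9B) becomes ESC `[`, and ESC `E` maps back to NEL (hex 85).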

ISO 8859 special characters

The two special characters, NBSP and SHY, are not really control characters. They are graphic characters with a special feature. The characters also appear in Unicode. They are included here for the sake of completeness.

‡) The 2-character mnemonics for NBSP and SHY are from RFC 1345. They are not standardized.

Dec	Hex	Char	Description	Octal	Pos	‡)
160	$A0	NBSP	No-Break Space	240	10/0	NS
			A space for use when a line break is to be prevented.
			Note: NBSP can sometimes be produced by pressing `Ctrl+Shift+SPACE`. No universally supported key combination exists.
			In HTML you can write `&nbsp;` or `&#160;` to add a no-break space to a web page.
			See also: SP
173	$AD	SHY	Soft Hyphen	255	10/13	--
			Indicates an intraword break point for use when a word must be broken across lines. The visual rendering either is a hyphen (ISO 8859) or varies (Unicode).
			Note: SHY can sometimes be produced by pressing `Ctrl+-`. No universally supported key combination exists.
			In HTML you can write `&shy;` or `&#173;` to add a soft hyphen to a web page.
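
As a quick check of the code positions, the named and numeric HTML character references for these two characters (`&nbsp;` / `&#160;` and `&shy;` / `&#173;`) can be decoded with Python's standard `html` module. This is only an illustrative sketch, and the variable name `decoded` is arbitrary.

```python
import html
import unicodedata

# Named and numeric references for NBSP (hex A0) and SHY (hex AD).
references = ("&nbsp;", "&#160;", "&shy;", "&#173;")

# Decode each reference to the single character it stands for.
decoded = {ref: html.unescape(ref) for ref in references}

if __name__ == "__main__":
    for ref, char in decoded.items():
        print(f"{ref:8} U+{ord(char):04X} {unicodedata.name(char)}")
```

Both forms of each reference decode to the same character, at hex A0 for NBSP and hex AD for SHY.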

Chart of control characters by code position (hex): the C0 set at 00–1F, SP at 20, DEL at 7F, the C1 set at 80–9F, and the ISO 8859 special characters NBSP at A0 and SHY at AD. Legend (categories): Delimiter, Introducer, Shift, Format effector, Info separator, Presentation control, Graphic, Area definition, Device control, Transmission control, Misc, Not assigned.

Translations

The translated terms are taken from the given standards. Several alternative translations may exist.

English		French		Russian		Spanish	German	Finnish
Standard		Unicode 5.0		GOST 34.301‑91, GOST 34.302.2‑91		T.53, T.51, T.50	DIN 66003	SFS 4017 *
NUL	Null	NUL	nul	ПУС	пусто	nulo	Füllzeichen	tyhjämerkki
SOH	Start of Heading	DET	début d'en-tête	НЗ	начало заголовка	comienzo de encabezamiento	Anfang des Kopfes	otsikon alku
STX	Start of Text	DTX	début de texte	НТ	начало текста	comienzo de texto	Anfang des Textes	tekstin alku
ETX	End of Text	FTX	fin de texte	КТ	конец текста	fin de texto	Ende des Textes	tekstin loppu
EOT	End of Transmission	FTR	fin de transmission	КП	конец передачи	fin de transmisión	Ende der Übertragung	tekstin loppu
ENQ	Enquiry	DEM	demande	КТМ	кто там?	pregunta	Stationsaufforderung	kysely
ACK	Acknowledge	ACC	accusé de réception [positif]	ДА	подтверждение	acuse de recibo	Positive Rückmeldung	kuittaus
BEL	Bell	SON	sonnerie	ЗВ	звонок	timbre	Klingel	äänimerkki
BS	Backspace	EFF	espace arrière	ВШ	возврат на шаг	retroceso	Rückwärtsschritt	peruutus
HT	Horizontal Tabulation	TAB	tabulation horizontale	ГТ	горизонтальная табуляция	tabulación de caracteres	Horizontal-Tabulator	sarakeohjaus
LF	Line Feed	PAL	changement de ligne	ПС	перевод строки	cambio de renglón	Zeilenvorschub	riviaskel
VT	Vertical Tabulation	TAV	tabulation verticale	ВТ	вертикальная табуляция	tabulación vertical	Vertikal-Tabulator	rivitys
FF	Form Feed	SDP	saut de page, page suivante	ПФ	перевод формата	página siguiente	Formularvorschub	sivun vaihto
CR	Carriage Return	RC	retour de chariot	ВК	возврат каретки	retorno del carro	Wagenrücklauf	vaunun palautus
SO	Shift Out	HC	hors code	ВЫХ	выход	cambio-salida	Dauerumschaltung	koodinvaihto
SI	Shift In	EC	en code	ВХ	вход	cambio-entrada	Rückschaltung	koodinpalautus
DLE	Data Link Escape	ÉCT	échappement transmission	AP1	авторегистр один	escape de enlace de datos	Datenübertragungsumschaltung	ohjauskoodin poikkeus
DC₁	Device Control 1	CD1	commande d'appareil un	СУ1	символ устройства один	control de dispositivo uno	Gerätesteuerung 1	laitteen ohjaus 1
DC₂	Device Control 2	CD2	commande d'appareil deux	СУ2	символ устройства два	control de dispositivo dos	Gerätesteuerung 2	laitteen ohjaus 2
DC₃	Device Control 3	CD3	commande d'appareil trois	СУ3	символ устройства три	control de dispositivo tres	Gerätesteuerung 3	laitteen ohjaus 3
DC₄	Device Control 4 (Stop)	CD4	commande d'appareil quatre	СУ4	символ устройства четыре	control de dispositivo cuatro	Gerätesteuerung 4	laitteen ohjaus 4
NAK	Negative Acknowledge	ACN	accusé de réception négatif	НЕТ	отрицание	acuse de recibo negativo	Negative Rückmeldung	kielteinen kuittaus
SYN	Synchronous Idle	SYN	synchronisation	СИН	синхронизация	reposo síncrono	Synchronisierung	tahditus
ETB	End of Transmission Block	FBT	fin de bloc de transmission	КБ	конец блока	fin de bloque de transmisión	Ende des Übertragungsblocks	jaksonsiirron loppu
CAN	Cancel	ANN	annulation	ОТМ	отмена	cancelar	Ungültig	sanoman peruutus
EM	End of Medium	FS	fin de support	КН	конец носителя	fin del medio físico	Ende der Aufzeichnung	tietovälineen loppu
SUB	Substitute	SUB	substitution	ЗС	замена символа	substituto	Substitution	korvike
ESC	Escape	ÉCH	échappement	АР2	авторегистр два	escape	Umschaltung	koodin poikkeus
FS	File Separator	SF	séparateur de fichiers	РФ	разделитель файлов	separador de fichero	Hauptgruppen-Trennung	tiedoston erotusmerkki
GS	Group Separator	SG	séparateur de groupes	РГ	разделитель групп	separador de grupo	Gruppen-Trennung	ryhmän erotusmerkki
RS	Record Separator	SA	séparateur d'enregistrements, séparateur d'articles	РЗ	разделитель записей	separador de registro	Untergruppen-Trennung	tietueiden erotusmerkki
US	Unit Separator	SSA	séparateur de sous-articles	РЭ	разделитель элементов	separador de unidad	Teilgruppen-Trennung	yksikön erotusmerkki
SP	Space	ESP	espace	ПР	пробел	espacio	Zwischenraum	tyhjä
DEL	Delete	SUP	suppression	ЗБ	забой	suprimir	Löschen	merkin poisto
PAD	"Padding Character"		caractère de bourre
HOP	"High Octet Preset"		octet supérieur prédéfini
BPH	Break Permitted Here	API	arrêt permis ici	РПС	разрешение переноса строки	corte permitido aquí
NBH	No Break Here	PAI	aucun arrêt ici	ЗПС	запрет переноса строки	corte no permitido aquí
IND	Index	IND	index	ИНД	индекс
NEL	Next Line	NL	à la ligne	НС	новая строка
SSA	Start of Selected Area	DZS	début de zone sélectionnée	НВО	начало выбранной области
ESA	End of Selected Area	FZS	fin de zone sélectionnée	КВО	конец выбранной области
HTS	Horizontal Tabulation Set	TTH	taquet de tabulateur horizontal	УГТ	установка горизонтальной табуляции
HTJ	Horizontal Tabulation with Justification	THJ	tabulateur horizontal avec justification	ГТВ	горизонтальная табуляция с выключкой
VTS	Vertical Tabulation Set	TTV	taquet de tabulateur vertical	УВТ	установка вертикальной табуляции
PLD	Partial Line Down	IPav	interligne partiel avant	CCB	смещение строки вперед	avance de línea parcial
PLU	Partial Line Up	IPar	interligne partiel arrière	CCH	смещение строки назад	retroceso de línea parcial
RI	Reverse Index	IR	index renversé, interligne inversé	ОПС	обратный перевод строки	cambio de renglón inverso
SS2	Single Shift Two	RU2	remplacement unique deux	ПЕ2	переключатель единичный два	cambio individual dos
SS3	Single Shift Three	RU3	remplacement unique trois	ПЕ3	переключатель единичный три	cambio individual tres
DCS	Device Control String	CCA	chaîne de commande d'appareils	УЦУ	управляющая цепочка устройства
PU1	Private Use One	UP1	usage privé un	ЧИ1	частное использование один
PU2	Private Use Two	UP2	usage privé deux	ЧИ2	частное использование два
STS	Set Transmit State	MMT	mise en mode transmission	УСП	установка состояния передачи
CCH	Cancel Character	ANC	annulation du caractère précédent	OTC	отмена символа
MW	Message Waiting	MES ATT	message en attente	ОС	ожидание сообщения
SPA	Start of Guarded Protected Area	DZP	début de zone protégée	НСО	начало сохраняемой области
EPA	End of Guarded Protected Area	FZP	fin de zone protégée	КСО	конец сохраняемой области
SOS	Start of String	DC	début de chaîne	НЦ	начало цепочки	comienzo de cadena
SGCI	"Single Graphic Character Introducer"		introducteur de caractère graphique unique
SCI	Single Character Introducer	ICU	introducteur de caractère unique	ГЕС	головной символ единичного символа
CSI	Control Sequence Introducer	ISC	introducteur de séquence de commandes	ГУП	головной символ управляющей последовательности	introductor de secuencia de control
ST	String Terminator	FC	fin de chaîne	ТРЦ	терминатор цепочки	terminador de cadena
OSC	Operating System Command	CSE	commande de système d'exploitation	КОС	команда операционной системы
PM	Privacy Message	MP	message privé	ЧС	частное сообщение
APC	Application Program Command	CO PRO	commande de progiciel	КПП	команда прикладной программы
NBSP	No-Break Space	ESP INS	espace insécable		непрерывающий пробел	espacio anticorte		yhdistävä välilyönti *
SHY	Soft Hyphen	CDN	trait d'union conditionnel		гибкий дефис	guión de corte programable		pehmeä tavuviiva *

* Finnish terms marked with an asterisk are not from any standard, but from recommendation Eurooppalaisen merkistön merkkien suomenkieliset nimet.

Character index

ACK	Acknowledge
APC	Application Program Command
BEL	Bell
BPH	Break Permitted Here
BS	Backspace
CAN	Cancel
CCH	Cancel Character
CR	Carriage Return
CSI	Control Sequence Introducer
DC₁	Device Control 1
DC₂	Device Control 2
DC₃	Device Control 3
DC₄	Device Control 4 (Stop)
DCS	Device Control String
DEL	Delete
DLE	Data Link Escape
EM	End of Medium
ENQ	Enquiry
EOT	End of Transmission
EPA	End of Guarded Area (formerly End of Protected Area)
ESA	End of Selected Area
ESC	Escape
ETB	End of Transmission Block
ETX	End of Text
FE₀	Format effector 0 (Backspace)
FE₁	Format effector 1 (Character Tabulation)
FE₂	Format effector 2 (Line Feed)
FE₃	Format effector 3 (Line Tabulation)
FE₄	Format effector 4 (Form Feed)
FE₅	Format effector 5 (Carriage Return)
FF	Form Feed
FS	File Separator
GS	Group Separator
HOP	"High Octet Preset" (unassigned)
HT	Horizontal Tabulation
HTJ	Horizontal Tabulation with Justification
HTS	Horizontal Tabulation Set
IND	Index
IS₁	Information separator 1 (Unit Separator)
IS₂	Information separator 2 (Record Separator)
IS₃	Information separator 3 (Group Separator)
IS₄	Information separator 4 (File Separator)
LF	Line Feed
LS₀	Locking-Shift Zero (Shift In)
LS₁	Locking-Shift One (Shift Out)
MW	Message Waiting
NAK	Negative Acknowledge
NBH	No Break Here
NBSP	No-Break Space
NEL	Next Line
NUL	Null
OSC	Operating System Command
PAD	"Padding Character" (unassigned)
PLD	Partial Line Down
PLU	Partial Line Up
PM	Privacy Message
PU1	Private Use One
PU2	Private Use Two
RI	Reverse Index
RS	Record Separator
SCI	Single Character Introducer
SGCI	"Single Graphic Character Introducer" (unassigned)
SHY	Soft Hyphen
SI	Shift In
SO	Shift Out
SOH	Start of Heading
SOS	Start of String
SP	Space
SPA	Start of Guarded Area (formerly Start of Protected Area)
SS2	Single Shift Two
SS3	Single Shift Three
SSA	Start of Selected Area
ST	String Terminator
STS	Set Transmit State
STX	Start of Text
SUB	Substitute
SYN	Synchronous Idle
TC₁	Transmission control character 1 (Start of Heading)
TC₂	Transmission control character 2 (Start of Text)
TC₃	Transmission control character 3 (End of Text)
TC₄	Transmission control character 4 (End of Transmission)
TC₅	Transmission control character 5 (Enquiry)
TC₆	Transmission control character 6 (Acknowledge)
TC₇	Transmission control character 7 (Data Link Escape)
TC₈	Transmission control character 8 (Negative Acknowledge)
TC₉	Transmission control character 9 (Synchronous Idle)
TC₁₀	Transmission control character 10 (End of Transmission Block)
US	Unit Separator
VT	Vertical Tabulation
VTS	Vertical Tabulation Set
XOFF	Device Control 3
XON	Device Control 1
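
The index lists several alternative abbreviations (FE, IS, LS, TC, XON/XOFF) that refer to the same characters as the primary abbreviations. As a minimal sketch, the lookup below resolves such an alias to its code position. The names ALIASES, CODES and resolve are hypothetical and chosen for illustration only; the code positions are the standard ASCII assignments, and subscript digits such as those in TC₁₀ are normalized to plain digits first.

```python
# Hypothetical lookup tables for illustration; not part of any standard.
# Alias abbreviation -> primary abbreviation, as paired in the index above.
ALIASES = {
    "FE0": "BS", "FE1": "HT", "FE2": "LF", "FE3": "VT", "FE4": "FF", "FE5": "CR",
    "IS1": "US", "IS2": "RS", "IS3": "GS", "IS4": "FS",
    "LS0": "SI", "LS1": "SO",
    "TC1": "SOH", "TC2": "STX", "TC3": "ETX", "TC4": "EOT", "TC5": "ENQ",
    "TC6": "ACK", "TC7": "DLE", "TC8": "NAK", "TC9": "SYN", "TC10": "ETB",
    "XON": "DC1", "XOFF": "DC3",
}

# Primary abbreviation -> ASCII code position (only those needed above).
CODES = {
    "SOH": 0x01, "STX": 0x02, "ETX": 0x03, "EOT": 0x04, "ENQ": 0x05,
    "ACK": 0x06, "BS": 0x08, "HT": 0x09, "LF": 0x0A, "VT": 0x0B,
    "FF": 0x0C, "CR": 0x0D, "SO": 0x0E, "SI": 0x0F, "DLE": 0x10,
    "DC1": 0x11, "DC3": 0x13, "NAK": 0x15, "SYN": 0x16, "ETB": 0x17,
    "FS": 0x1C, "GS": 0x1D, "RS": 0x1E, "US": 0x1F,
}

# Translate subscript digits (as printed in the index) to plain digits.
SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")


def resolve(abbr: str) -> int:
    """Return the code position for a primary or alias abbreviation."""
    plain = abbr.translate(SUBSCRIPTS).upper()
    primary = ALIASES.get(plain, plain)
    return CODES[primary]  # raises KeyError for abbreviations not listed here
```

For example, resolve("XOFF") and resolve("DC₃") both yield 0x13. The table covers only the ASCII characters that have an alias in the index; extending it to the full C0 and C1 sets would follow the same pattern.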

Sources

ASA standard X3.4-1963: American Standard Code for Information Interchange. Note: ASCII-1963.
USAS X3.4-1967: USA Standard Code for Information Interchange. United States of America Standards Institute, New York, USA, 1967. Note: ASCII-1967.
USAS X3.4-1968: USA Standard Code for Information Interchange. Reprinted as NIC 11246 in Feinler & Postel (ed.): Arpanet Protocol Handbook. NIC 7104 Rev. Jan 1978. ADA-052 594. Network Information Center, Menlo Park, California, USA. Note: ASCII-1968.
ANSI X3.4-1977: American National Standard Code for Information Interchange. American National Standards Institute, Inc, New York, USA, 1977. Also reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982. Note: ASCII-1977.
ANSI X3.4-1986: Coded Character Sets – 7-bit American National Standard Code for Information Interchange. American National Standards Institute, Inc, New York, USA, 1986. Note: ASCII-1986.
ANSI X3.32-1973: Graphic Representation of the Control Characters of American National Standard Code for Information Interchange. Reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
ANSI X3.64-1979: Additional Controls for Use with American National Standard Code for Information Interchange. American National Standards Institute, Inc, New York, USA, 1979.
Bemer, R.W.: Inside ASCII. Best of Interface Age, Volume 2: General Purpose Software. Oregon, USA (1980). Pages 1–50.
Bies, Lammert: ASCII character map.
Digital Research: An Introduction to CP/M Features and Facilities, version 1.3, 1976.
ECMA-6: 7-bit Coded Character Set, 4th edition 1973, 5th edition 1985.
ECMA-17: Graphic Representation of the Control Characters of the ECMA 7-Bit Coded Character Set for Information Interchange, 1st edition (withdrawn).
ECMA-35: Character Code Structure and Extension Techniques, 6th edition.
ECMA-48: Control Functions for Coded Character Sets, 2nd, 3rd, 4th and 5th edition.
Gerstung, Olaf: Tabellen — Verschiedenes. Bedeutung der Steuerzeichen im ASCII und nach DIN 66003.
GOST 34.301-91: Information technology. 7-bit and 8-bit coded character sets. Control functions – ГОСТ 34.301-91 (ИСО 6429-88) Информационная технология. 7-битные и 8-битные кодированные наборы символов. Управляющие функции.
GOST 34.302.2-91: Information technology. 8-bit single-byte coded graphic character sets. Latin alphabet No. 2 – ГОСТ 34.302-91 (ИСО 8859/2-87) Информационная технология. Наборы 8-битных однобайтовых кодированных графических символов. Латинский алфавит № 2.
Helsingin yliopiston yleisen kielitieteen laitos: Eurooppalaisen merkistön merkkien suomenkieliset nimet, 2. laitos, toukokuu 2004.
ISO / R 646-1967 (E): 6 and 7-bit coded character sets for information processing interchange, 1st edition December 1967. International Organization for Standardization, Switzerland.
ISO 646-1973 (E): 7-bit coded character set for information processing interchange. ISO Standards Handbook 1: Information transfer, 1st edition, 1977. Also reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
ISO 646:1991: Information technology – 7-bit coded character set for information processing interchange.
ISO 2022-1973 (E): Code extension techniques for use with the ISO 7-bit coded character set. ISO Standards Handbook 1: Information transfer, 1st edition, 1977.
ISO 2047-1975 (E): Information processing – Graphical representations for the control characters of the 7-bit coded character set. ISO Standards Handbook 1: Information transfer, 1st edition, 1977.
ISO/IEC 6429:1992 (E): Information technology – Control functions for coded character sets.
ISO 1745-1975 (E): Information processing – Basic mode control procedures for data communication systems. Reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
ISO/IEC 8859: Information technology – 8-bit single-byte coded graphic character sets. Note: Mostly ISO/IEC 8859-1:1998: 8-bit single-byte coded graphic character sets – Part 1: Latin alphabet No. 1.
ISO-IR 001: The set of control characters of the ISO 646. Note: ISO-IR 001 deviates slightly from ISO 646-1973 in wording. DEL missing.
ISO-IR 077: C1 Control Character Set of ISO 6429-1983.
Jennings, Tom: An annotated history of some character codes, revised 29 October, 2004.
RFC 20: ASCII format for Network Interchange. Note: Identical to USAS X3.4-1968 (ASCII-1968). Missing Appendix A–D.
RFC 1345: Character Mnemonics & Character Sets.
SFS 4017: Tietojen vaihdossa käytettävä 7-bittinen koodi – 7-bit coded character set for information processing interchange. Suomen standardisoimisliitto, Helsinki, Finland, 1977.
UIT-T T.50 (04/92): Alfabeto internacional de referencia, (anteriormente alfabeto internacional N.° 5 o IA5) – Tecnología de la información - Juego de caracteres codificado de siete bits para intercambio de información.
UIT-T T.51 (09/92): Juegos de caracteres codificados basados en el alfabeto latino para los servicios de telemática.
UIT-T T.53 (04/94): Funciones de control codificadas mediante caracteres para los servicios telemáticos.
Unicode, Inc.: Unicode 5.0, section française.
Unicode, Inc.: The Unicode Standard, version 9.0.0, 2016.
Unicode, Inc.: Unicode Character Database, NameAliases-9.0.0.txt.
Whistler, Ken: Why Nothing Ever Goes Away (was: Re: Acquiring DIS 10646). Unicode Mail List, 5 Oct 2015.
Wikipedia: ASCII.
Wikipedia: C0 and C1 control codes.
Wikipedia: Control character.
Wikipedia: Newline.
Wikipedia: Software flow control.

Most of the sources were consulted in September and October 2011.

Special thanks to Douglas A. Kerr, the principal author and editor of the published standards document of the first complete version of ASCII, for his help.

Updated in August 2016: Unicode 9.0, CP/M, additional details on PAD, HOP and SGCI.

Updated in November 2022: Source links updated/fixed.

Control characters in ASCII and Unicode
URN:NBN:fi-fe201109235583
